Background

The file ‘WeatherEDA_202005_v002.Rmd’ contains exploratory data analysis of historical weather data from the METAR archives hosted by Iowa State University.

Data have been downloaded, processed, cleaned, and integrated for several stations (airports) and years, with .rds files saved in “./RInputFiles/ProcessedMETAR”.

This module will perform initial modeling on the processed weather files. It builds on the previous ‘WeatherModeling_202006_v001.Rmd’ and ‘WeatherModeling_202006_v002.Rmd’ as well as leveraging functions in ‘WeatherModelingFunctions_v001.R’.

This file focuses on:

  1. Random forest classification of select locales using 2014-2019 data
  2. Random forest classification of locales in 2016
  3. Random forest regression of temperatures in select locales for 2014-2019
  4. XGB regression of temperatures in select locales for 2014-2019
  5. XGB classification of locales in 2016

There are numerous other models available in ‘WeatherModeling_202006_v002.Rmd’.

Data Availability

There are three main processed files available for further exploration:

metar_postEDA_20200617.rds and metar_postEDA_extra_20200627.rds

  • source (chr) - the reporting station and year
  • locale (chr) - the descriptive name for source
  • dtime (dttm) - the date-time for the observation
  • origMETAR (chr) - the original METAR associated with the observation at that source and date-time
  • year (dbl) - the year, extracted from dtime
  • monthint (dbl) - the month, extracted from dtime, as an integer
  • month (fct) - the month, extracted from dtime, as a three-character abbreviation (factor)
  • day (int) - the day of the month, extracted from dtime
  • WindDir (chr) - prevailing wind direction in degrees, stored as a character since ‘VRB’ means variable
  • WindSpeed (int) - the prevailing wind speed in knots
  • WindGust (dbl) - the wind gust speed in knots (NA if there is no recorded wind gust at that hour)
  • predomDir (chr) - the predominant wind direction as NE-E-SE-S-SW-W-NW-N-VRB-000-Error
  • Visibility (dbl) - surface visibility in statute miles
  • Altimeter (dbl) - altimeter in inches of mercury
  • TempF (dbl) - the Fahrenheit temperature
  • DewF (dbl) - the Fahrenheit dew point
  • modSLP (dbl) - Sea-Level Pressure (SLP), adjusted to account for the raw values being recorded as 0-1000 while the underlying data span 950-1050 hPa
  • cTypen (chr) - the cloud type of the nth cloud layer (FEW, BKN, SCT, OVC, or VV)
  • cLeveln (dbl) - the cloud height in feet of the nth cloud layer
  • isRain (lgl) - was rain occurring at the moment the METAR was captured?
  • isSnow (lgl) - was snow occurring at the moment the METAR was captured?
  • isThunder (lgl) - was thunder occurring at the moment the METAR was captured?
  • p1Inches (dbl) - how many inches of rain occurred in the past hour?
  • p36Inches (dbl) - how many inches of rain occurred in the past 3/6 hours (3-hour summaries at 3Z-9Z-15Z-21Z and 6-hour summaries at 6Z-12Z-18Z-24Z and NA at any other Z times)?
  • p24Inches (dbl) - how many inches of rain occurred in the past 24 hours (at 12Z, NA at all other times)
  • tempFHi (dbl) - the high temperature in the past 24 hours, in Fahrenheit (reported once per day)
  • tempFLo (dbl) - the low temperature in the past 24 hours, in Fahrenheit (reported once per day)
  • minHeight (dbl) - the minimum cloud height in feet (-100 means ‘no clouds’)
  • minType (fct) - amount of obscuration at the minimum cloud height (VV > OVC > BKN > SCT > FEW > CLR)
  • ceilingHeight (dbl) - the minimum cloud ceiling in feet (-100 means ‘no ceiling’)
  • ceilingType (fct) - amount of obscuration at the minimum ceiling height (VV > OVC > BKN)
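The modSLP adjustment can be sketched as follows, assuming the standard METAR SLP convention (tenths of hPa with the leading ‘9’ or ‘10’ dropped); `decodeSLP` is an illustrative helper, not a function from this project:

```r
# Illustrative sketch (not project code): decode a raw METAR SLP group.
# The group stores tenths of hPa with the leading "9"/"10" dropped,
# so 982 decodes to 998.2 hPa and 132 to 1013.2 hPa, since actual
# sea-level pressures cluster near 950-1050 hPa.
decodeSLP <- function(raw) {
  hPa <- raw / 10
  ifelse(hPa >= 50, hPa + 900, hPa + 1000)
}
decodeSLP(c(982, 132))
```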

metar_modifiedClouds_20200617.rds and metar_modifiedclouds_extra_20200627.rds

  • source (chr) - the reporting station and year
  • sourceName (chr) - the descriptive name for source
  • dtime (dttm) - the date-time for the observation
  • level (dbl) - cloud level (level 0 is inserted for every source-dtime as a base layer of clear)
  • height (dbl) - level height (height -100 is inserted for every source-dtime as a base layer of clear)
  • type (chr) - level type (type CLR is inserted for every source-dtime as a base layer of clear)

metar_precipLists_20200617.rds and metar_precipLists_extra_20200627.rds

  • Contains elements for each of rain/snow/thunder for each of 2015/2016/2017
  • Each element contains a list and a tibble
  • The tibble is precipLength and contains precipitation by month as source-locale-month-hours-events
  • The list is precipList and contains data on each precipitation interval

Several mapping files are defined for use in plotting; tidyverse, lubridate, and caret are loaded; and the relevant functions are sourced:

# The process frequently uses tidyverse, lubridate, caret, and randomForest
library(tidyverse)
## -- Attaching packages --------------------------------------------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.2.1     v purrr   0.3.3
## v tibble  2.1.3     v dplyr   0.8.4
## v tidyr   1.0.2     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.4.0
## -- Conflicts ------------------------------------------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(lubridate)
## 
## Attaching package: 'lubridate'
## The following object is masked from 'package:base':
## 
##     date
library(caret)
## Loading required package: lattice
## 
## Attaching package: 'caret'
## The following object is masked from 'package:purrr':
## 
##     lift
library(randomForest)
## randomForest 4.6-14
## Type rfNews() to see new features/changes/bug fixes.
## 
## Attaching package: 'randomForest'
## The following object is masked from 'package:dplyr':
## 
##     combine
## The following object is masked from 'package:ggplot2':
## 
##     margin
# The main path for the files
filePath <- "./RInputFiles/ProcessedMETAR/"


# Sourcing functions
source("./WeatherModelingFunctions_v001.R")


# Descriptive names for key variables
varMapper <- c(source="Original source file name", 
               locale="Descriptive name",
               dtime="Date-Time (UTC)",
               origMETAR="Original METAR",
               year="Year",
               monthint="Month",
               month="Month", 
               day="Day of Month",
               WindDir="Wind Direction (degrees)", 
               WindSpeed="Wind Speed (kts)",
               WindGust="Wind Gust (kts)",
               predomDir="General Prevailing Wind Direction",
               Visibility="Visibility (SM)", 
               Altimeter="Altimeter (inches Hg)",
               TempF="Temperature (F)",
               DewF="Dew Point (F)", 
               modSLP="Sea-Level Pressure (hPa)", 
               cType1="First Cloud Layer Type", 
               cLevel1="First Cloud Layer Height (ft)",
               isRain="Rain at Observation Time",
               isSnow="Snow at Observation Time",
               isThunder="Thunder at Observation Time",
               tempFHi="24-hour High Temperature (F)",
               tempFLo="24-hour Low Temperature (F)",
               minHeight="Minimum Cloud Height (ft)",
               minType="Obscuration Level at Minimum Cloud Height",
               ceilingHeight="Minimum Ceiling Height (ft)",
               ceilingType="Obscuration Level at Minimum Ceiling Height", 
               hr="Hour of Day (Zulu time)",
               hrfct="Hour of Day (Zulu time)",
               hrBucket="Hour of Day (Zulu time) - rounded to nearest 3",
               locNamefct="Locale Name"
               )


# File name to city name mapper
cityNameMapper <- c(katl_2016="Atlanta, GA (2016)",
                    kbos_2016="Boston, MA (2016)", 
                    kdca_2016="Washington, DC (2016)", 
                    kden_2016="Denver, CO (2016)", 
                    kdfw_2016="Dallas, TX (2016)", 
                    kdtw_2016="Detroit, MI (2016)", 
                    kewr_2016="Newark, NJ (2016)",
                    kgrb_2016="Green Bay, WI (2016)",
                    kgrr_2016="Grand Rapids, MI (2016)",
                    kiah_2016="Houston, TX (2016)",
                    kind_2016="Indianapolis, IN (2016)",
                    klas_2014="Las Vegas, NV (2014)",
                    klas_2015="Las Vegas, NV (2015)",
                    klas_2016="Las Vegas, NV (2016)", 
                    klas_2017="Las Vegas, NV (2017)", 
                    klas_2018="Las Vegas, NV (2018)",
                    klas_2019="Las Vegas, NV (2019)",
                    klax_2016="Los Angeles, CA (2016)", 
                    klnk_2016="Lincoln, NE (2016)",
                    kmia_2016="Miami, FL (2016)", 
                    kmke_2016="Milwaukee, WI (2016)",
                    kmsn_2016="Madison, WI (2016)",
                    kmsp_2016="Minneapolis, MN (2016)",
                    kmsy_2014="New Orleans, LA (2014)",
                    kmsy_2015="New Orleans, LA (2015)",
                    kmsy_2016="New Orleans, LA (2016)", 
                    kmsy_2017="New Orleans, LA (2017)", 
                    kmsy_2018="New Orleans, LA (2018)",
                    kmsy_2019="New Orleans, LA (2019)",
                    kord_2014="Chicago, IL (2014)",
                    kord_2015="Chicago, IL (2015)",
                    kord_2016="Chicago, IL (2016)", 
                    kord_2017="Chicago, IL (2017)", 
                    kord_2018="Chicago, IL (2018)",
                    kord_2019="Chicago, IL (2019)",
                    kphl_2016="Philadelphia, PA (2016)", 
                    kphx_2016="Phoenix, AZ (2016)", 
                    ksan_2014="San Diego, CA (2014)",
                    ksan_2015="San Diego, CA (2015)",
                    ksan_2016="San Diego, CA (2016)",
                    ksan_2017="San Diego, CA (2017)",
                    ksan_2018="San Diego, CA (2018)",
                    ksan_2019="San Diego, CA (2019)",
                    ksat_2016="San Antonio, TX (2016)", 
                    ksea_2016="Seattle, WA (2016)", 
                    ksfo_2016="San Francisco, CA (2016)", 
                    ksjc_2016="San Jose, CA (2016)",
                    kstl_2016="Saint Louis, MO (2016)", 
                    ktpa_2016="Tampa Bay, FL (2016)", 
                    ktvc_2016="Traverse City, MI (2016)"
                    )

# File names in 2016, based on cityNameMapper
names_2016 <- grep(names(cityNameMapper), pattern="[a-z]{3}_2016", value=TRUE)

The main data will be from the metar_postEDA files. They are integrated below; cloud and ceiling heights are converted to factors, hour is added as both a numeric and a factor variable, and locale is added as a factor variable:

# Main weather data
metarData <- readRDS("./RInputFiles/ProcessedMETAR/metar_postEDA_20200617.rds") %>%
    bind_rows(readRDS("./RInputFiles/ProcessedMETAR/metar_postEDA_extra_20200627.rds")) %>%
    mutate(orig_minHeight=minHeight, 
           orig_ceilingHeight=ceilingHeight, 
           minHeight=mapCloudHeight(minHeight), 
           ceilingHeight=mapCloudHeight(ceilingHeight), 
           hr=lubridate::hour(lubridate::round_date(dtime, unit="1 hour")),
           hrfct=factor(hr), 
           locNamefct=factor(str_replace(locale, pattern=" \\(\\d{4}\\)", replacement=""))
           )
glimpse(metarData)
## Observations: 437,417
## Variables: 46
## $ source             <chr> "kdtw_2016", "kdtw_2016", "kdtw_2016", "kdtw_201...
## $ locale             <chr> "Detroit, MI (2016)", "Detroit, MI (2016)", "Det...
## $ dtime              <dttm> 2016-01-01 00:53:00, 2016-01-01 01:53:00, 2016-...
## $ origMETAR          <chr> "KDTW 010053Z 23012KT 10SM OVC020 00/M05 A3021 R...
## $ year               <dbl> 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, ...
## $ monthint           <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
## $ month              <fct> Jan, Jan, Jan, Jan, Jan, Jan, Jan, Jan, Jan, Jan...
## $ day                <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
## $ WindDir            <chr> "230", "230", "230", "240", "230", "220", "220",...
## $ WindSpeed          <int> 12, 12, 11, 14, 16, 13, 14, 16, 13, 16, 17, 13, ...
## $ WindGust           <dbl> NA, NA, NA, 23, 22, NA, 20, 20, NA, 22, NA, NA, ...
## $ predomDir          <fct> SW, SW, SW, SW, SW, SW, SW, SW, SW, SW, SW, SW, ...
## $ Visibility         <dbl> 10, 10, 10, 10, 10, 10, 10, 10, 8, 5, 7, 8, 10, ...
## $ Altimeter          <dbl> 30.21, 30.21, 30.19, 30.19, 30.18, 30.16, 30.14,...
## $ TempF              <dbl> 32.00, 32.00, 32.00, 30.92, 30.92, 32.00, 30.92,...
## $ DewF               <dbl> 23.00, 21.92, 21.02, 19.94, 19.94, 19.94, 19.94,...
## $ modSLP             <dbl> 1023.6, 1023.5, 1023.0, 1023.0, 1022.7, 1022.0, ...
## $ cType1             <chr> "OVC", "OVC", "OVC", "OVC", "OVC", "OVC", "OVC",...
## $ cType2             <chr> "", "", "", "", "", "", "", "", "", "OVC", "OVC"...
## $ cType3             <chr> "", "", "", "", "", "", "", "", "", "", "", "", ...
## $ cType4             <chr> "", "", "", "", "", "", "", "", "", "", "", "", ...
## $ cType5             <chr> "", "", "", "", "", "", "", "", "", "", "", "", ...
## $ cType6             <chr> "", "", "", "", "", "", "", "", "", "", "", "", ...
## $ cLevel1            <dbl> 2000, 2000, 2200, 2500, 2700, 2500, 2500, 2500, ...
## $ cLevel2            <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, 4500, 4000, ...
## $ cLevel3            <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ cLevel4            <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ cLevel5            <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ cLevel6            <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ isRain             <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,...
## $ isSnow             <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,...
## $ isThunder          <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,...
## $ p1Inches           <dbl> NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0, 0, N...
## $ p36Inches          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, 0, NA, NA, 0, NA...
## $ p24Inches          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ tempFHi            <dbl> NA, NA, NA, NA, 36, NA, NA, NA, NA, NA, NA, NA, ...
## $ tempFLo            <dbl> NA, NA, NA, NA, 31, NA, NA, NA, NA, NA, NA, NA, ...
## $ minHeight          <fct> Low, Low, Low, Low, Low, Low, Low, Low, Low, Low...
## $ minType            <fct> OVC, OVC, OVC, OVC, OVC, OVC, OVC, OVC, OVC, BKN...
## $ ceilingHeight      <fct> Low, Low, Low, Low, Low, Low, Low, Low, Low, Low...
## $ ceilingType        <fct> OVC, OVC, OVC, OVC, OVC, OVC, OVC, OVC, OVC, BKN...
## $ orig_minHeight     <dbl> 2000, 2000, 2200, 2500, 2700, 2500, 2500, 2500, ...
## $ orig_ceilingHeight <dbl> 2000, 2000, 2200, 2500, 2700, 2500, 2500, 2500, ...
## $ hr                 <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 1...
## $ hrfct              <fct> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 1...
## $ locNamefct         <fct> "Detroit, MI", "Detroit, MI", "Detroit, MI", "De...

Random Forest Classification (Select Locales for 2014-2019)

Models are run on all 2014-2019 data for Chicago, Las Vegas, New Orleans, and San Diego:

# Create the subset for Chicago, Las Vegas, New Orleans, San Diego
sub_2014_2019_data <- metarData %>%
    filter(str_sub(source, 1, 4) %in% c("kord", "klas", "kmsy", "ksan"), 
           year %in% c(2014, 2015, 2016, 2017, 2018, 2019)
           ) %>%
    mutate(city=str_replace(locale, pattern=" .\\d{4}.", replacement=""), 
           hr=lubridate::hour(dtime)
           )

# Check that proper locales are included
sub_2014_2019_data %>% 
    count(city, locale)
## # A tibble: 24 x 3
##    city          locale                   n
##    <chr>         <chr>                <int>
##  1 Chicago, IL   Chicago, IL (2014)    8718
##  2 Chicago, IL   Chicago, IL (2015)    8728
##  3 Chicago, IL   Chicago, IL (2016)    8767
##  4 Chicago, IL   Chicago, IL (2017)    8740
##  5 Chicago, IL   Chicago, IL (2018)    8737
##  6 Chicago, IL   Chicago, IL (2019)    8750
##  7 Las Vegas, NV Las Vegas, NV (2014)  8739
##  8 Las Vegas, NV Las Vegas, NV (2015)  8727
##  9 Las Vegas, NV Las Vegas, NV (2016)  8770
## 10 Las Vegas, NV Las Vegas, NV (2017)  8664
## # ... with 14 more rows

The random forest model is run and cached:

# Run random forest for 2014-2019 data
rf_types_2014_2019_TDmcwha <- rfMultiLocale(sub_2014_2019_data, 
                                            vrbls=c("TempF", "DewF", 
                                                    "month", "hr",
                                                    "minHeight", "ceilingHeight", 
                                                    "WindSpeed", "predomDir", 
                                                    "modSLP"
                                                    ),
                                            locs=NULL, 
                                            locVar="city",
                                            pred="city",
                                            ntree=50, 
                                            seed=2006301420, 
                                            mtry=4
                                            )
## 
## Running for locations:
## [1] "Chicago, IL"     "Las Vegas, NV"   "New Orleans, LA" "San Diego, CA"
evalPredictions(rf_types_2014_2019_TDmcwha, 
                plotCaption = "Temp, Dew Point, Month, Hour of Day, Cloud Height, Wind, SLP", 
                keyVar="city"
                )

## # A tibble: 16 x 5
##    locale          predicted       correct     n     pct
##    <fct>           <fct>           <lgl>   <int>   <dbl>
##  1 Chicago, IL     Chicago, IL     TRUE    14700 0.939  
##  2 Chicago, IL     Las Vegas, NV   FALSE     243 0.0155 
##  3 Chicago, IL     New Orleans, LA FALSE     382 0.0244 
##  4 Chicago, IL     San Diego, CA   FALSE     329 0.0210 
##  5 Las Vegas, NV   Chicago, IL     FALSE     237 0.0153 
##  6 Las Vegas, NV   Las Vegas, NV   TRUE    14878 0.958  
##  7 Las Vegas, NV   New Orleans, LA FALSE     130 0.00837
##  8 Las Vegas, NV   San Diego, CA   FALSE     288 0.0185 
##  9 New Orleans, LA Chicago, IL     FALSE     383 0.0242 
## 10 New Orleans, LA Las Vegas, NV   FALSE     137 0.00865
## 11 New Orleans, LA New Orleans, LA TRUE    14746 0.931  
## 12 New Orleans, LA San Diego, CA   FALSE     567 0.0358 
## 13 San Diego, CA   Chicago, IL     FALSE     248 0.0157 
## 14 San Diego, CA   Las Vegas, NV   FALSE     454 0.0288 
## 15 San Diego, CA   New Orleans, LA FALSE     334 0.0212 
## 16 San Diego, CA   San Diego, CA   TRUE    14739 0.934

Even with a small forest (50 trees), the model almost always separates Chicago, Las Vegas, New Orleans, and San Diego correctly. While the climates of these cities are very different, it is striking that the model has so few misclassifications.

How do other cities map against these classifications?

# Predictions on 2014-2019 data
helperPredictPlot(rf_types_2014_2019_TDmcwha$rfModel, 
                  df=filter(mutate(metarData, hr=lubridate::hour(dtime)), 
                            !(str_sub(source, 1, 4) %in% c("kord", "klas", "kmsy", "ksan"))
                            ), 
                  predOrder=c("Chicago, IL", "San Diego, CA", "New Orleans, LA", "Las Vegas, NV")
                  )

## # A tibble: 104 x 4
##    locale             predicted           n    pct
##    <chr>              <fct>           <int>  <dbl>
##  1 Atlanta, GA (2016) Chicago, IL      3025 0.346 
##  2 Atlanta, GA (2016) Las Vegas, NV     795 0.0908
##  3 Atlanta, GA (2016) New Orleans, LA  4108 0.469 
##  4 Atlanta, GA (2016) San Diego, CA     823 0.0940
##  5 Boston, MA (2016)  Chicago, IL      7619 0.880 
##  6 Boston, MA (2016)  Las Vegas, NV     459 0.0530
##  7 Boston, MA (2016)  New Orleans, LA   337 0.0389
##  8 Boston, MA (2016)  San Diego, CA     239 0.0276
##  9 Dallas, TX (2016)  Chicago, IL      2708 0.310 
## 10 Dallas, TX (2016)  Las Vegas, NV    1134 0.130 
## # ... with 94 more rows

Classifications are broadly as expected based on climates by locale. Variable importances are plotted:

helperPlotVarImp(rf_types_2014_2019_TDmcwha$rfModel)

Dew point and temperature are strong factors for separating the four cities in this analysis. Month, SLP, minimum cloud height, and prevailing wind direction are also meaningful.
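helperPlotVarImp comes from the sourced functions file; for reference, randomForest’s built-in importance utilities report similar information. A minimal sketch on iris (a stand-in dataset, since the weather data are not bundled here):

```r
library(randomForest)

# Minimal sketch on iris as a stand-in dataset: randomForest's built-in
# importance measures, analogous to what helperPlotVarImp displays.
set.seed(1)
fit <- randomForest(Species ~ ., data = iris, ntree = 50, importance = TRUE)
importance(fit)   # MeanDecreaseAccuracy and MeanDecreaseGini per predictor
varImpPlot(fit)   # dot plot of the same measures
```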

An assessment can be run for the 2014-2019 model:

# Run for the full model including SLP
probs_2014_2019_TDmcwha <- 
    assessPredictionCertainty(rf_types_2014_2019_TDmcwha, 
                              keyVar="city", 
                              plotCaption="Temp, Dew Point, Month/Hour, Clouds, Wind, SLP", 
                              showAcc=TRUE
                              )
## Note: Using an external vector in selections is ambiguous.
## i Use `all_of(keyVar)` instead of `keyVar` to silence this message.
## i See <https://tidyselect.r-lib.org/reference/faq-external-vector.html>.
## This message is displayed once per session.
## Note: Using an external vector in selections is ambiguous.
## i Use `all_of(pkVars)` instead of `pkVars` to silence this message.
## i See <https://tidyselect.r-lib.org/reference/faq-external-vector.html>.
## This message is displayed once per session.

  • Predictions with 80%+ of the votes are made ~75% of the time, and these predictions are ~99% accurate
  • Predictions with <80% of the votes are made ~25% of the time, and these predictions are ~80% accurate
  • The percentage of votes received appears to be a reasonable proxy for the confidence of the prediction
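The vote-share idea above can be sketched directly with randomForest: `predict(..., type = "vote")` returns each class’s share of the tree votes, and the winning class’s share serves as the confidence proxy (assessPredictionCertainty is project-specific; iris stands in for the weather data):

```r
library(randomForest)

# Sketch of the vote-share confidence proxy, with iris standing in for
# the weather data.
set.seed(1)
rfFit <- randomForest(Species ~ ., data = iris, ntree = 50)
voteFrac <- predict(rfFit, newdata = iris, type = "vote", norm.votes = TRUE)
confidence <- apply(voteFrac, 1, max)   # winning class's share of the votes
mean(confidence >= 0.80)                # fraction of predictions made with 80%+ of votes
```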

A similar process can be run for assessing the classification of the other cities against the 2014-2019 data for Chicago, Las Vegas, New Orleans, and San Diego:

useData <- metarData %>%
    filter(!(str_sub(source, 1, 4) %in% c("kord", "klas", "kmsy", "ksan"))) %>%
    mutate(hr=lubridate::hour(dtime))
    
# Run for the model excluding SLP
probs_allcities_2014_2019_TDmcwh <- 
    assessPredictionCertainty(rf_types_2014_2019_TDmcwha, 
                              testData=useData,
                              keyVar="locale", 
                              plotCaption="Temp, Dew Point, Month/Hour, Clouds, Wind, modSLP", 
                              showHists=TRUE
                              )
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

The model is often less confident when assigning an archetype to related cities, though it usually makes the most sensible assignment.

Random Forest Classification (2016)

Next, every pair of cities is compared, using all variables, an mtry of 4, and a very small forest of 15 trees:

# Create a container list to hold the output
list_varimp_2016 <- vector("list", 0.5*length(names_2016)*(length(names_2016)-1))

# Set a random seed
set.seed(2007031342)

# Loop through all possible combinations
n <- 1
for (ctr in 1:(length(names_2016)-1)) {
    for (ctr2 in (ctr+1):length(names_2016)) {
        list_varimp_2016[[n]] <- rfTwoLocales(mutate(metarData, hr=lubridate::hour(dtime)), 
                                              loc1=names_2016[ctr], 
                                              loc2=names_2016[ctr2], 
                                              vrbls=c("TempF", "DewF", 
                                                      "month", "hr",
                                                      "minHeight", "ceilingHeight", 
                                                      "WindSpeed", "predomDir", 
                                                      "modSLP", "Altimeter"
                                                      ),
                                              ntree=15, 
                                              mtry=4
                                              )
        n <- n + 1
        if ((n %% 40) == 0) { cat("Through number:", n, "\n")}
    }
}
## Through number: 40 
## Through number: 80 
## Through number: 120 
## Through number: 160 
## Through number: 200 
## Through number: 240 
## Through number: 280 
## Through number: 320 
## Through number: 360 
## Through number: 400
# Create a tibble from the underlying accuracy data
acc_varimp_2016 <- map_dfr(list_varimp_2016, .f=helperAccuracyLocale)

# Assess the top 20 classification accuracies
acc_varimp_2016 %>%
    arrange(-accOverall) %>%
    head(20)
## # A tibble: 20 x 5
##    locale1                locale2               accOverall accLocale1 accLocale2
##    <chr>                  <chr>                      <dbl>      <dbl>      <dbl>
##  1 Denver, CO (2016)      Miami, FL (2016)           0.998      0.998      0.998
##  2 Denver, CO (2016)      Tampa Bay, FL (2016)       0.996      0.998      0.995
##  3 Las Vegas, NV (2016)   Miami, FL (2016)           0.995      0.995      0.995
##  4 Denver, CO (2016)      New Orleans, LA (201~      0.995      0.996      0.994
##  5 Denver, CO (2016)      San Diego, CA (2016)       0.994      0.994      0.994
##  6 Denver, CO (2016)      Houston, TX (2016)         0.993      0.994      0.992
##  7 Denver, CO (2016)      San Francisco, CA (2~      0.993      0.994      0.992
##  8 Miami, FL (2016)       Seattle, WA (2016)         0.992      0.990      0.994
##  9 Miami, FL (2016)       Phoenix, AZ (2016)         0.992      0.991      0.992
## 10 Miami, FL (2016)       Traverse City, MI (2~      0.991      0.993      0.990
## 11 Miami, FL (2016)       Minneapolis, MN (201~      0.991      0.991      0.990
## 12 Boston, MA (2016)      Miami, FL (2016)           0.991      0.990      0.991
## 13 Denver, CO (2016)      Los Angeles, CA (201~      0.991      0.993      0.989
## 14 Denver, CO (2016)      San Jose, CA (2016)        0.990      0.991      0.989
## 15 Denver, CO (2016)      San Antonio, TX (201~      0.990      0.992      0.988
## 16 Grand Rapids, MI (201~ Miami, FL (2016)           0.990      0.990      0.990
## 17 Green Bay, WI (2016)   Miami, FL (2016)           0.990      0.989      0.990
## 18 Miami, FL (2016)       Milwaukee, WI (2016)       0.989      0.989      0.989
## 19 Madison, WI (2016)     Miami, FL (2016)           0.989      0.986      0.992
## 20 Las Vegas, NV (2016)   Tampa Bay, FL (2016)       0.989      0.990      0.987
# Assess the bottom 20 classification accuracies
acc_varimp_2016 %>%
    arrange(accOverall) %>%
    head(20)
## # A tibble: 20 x 5
##    locale1                locale2               accOverall accLocale1 accLocale2
##    <chr>                  <chr>                      <dbl>      <dbl>      <dbl>
##  1 Chicago, IL (2016)     Milwaukee, WI (2016)       0.675      0.685      0.666
##  2 Newark, NJ (2016)      Philadelphia, PA (20~      0.676      0.687      0.665
##  3 Detroit, MI (2016)     Grand Rapids, MI (20~      0.712      0.718      0.706
##  4 Madison, WI (2016)     Milwaukee, WI (2016)       0.718      0.725      0.711
##  5 Philadelphia, PA (201~ Washington, DC (2016)      0.730      0.741      0.718
##  6 Chicago, IL (2016)     Grand Rapids, MI (20~      0.731      0.750      0.711
##  7 Green Bay, WI (2016)   Madison, WI (2016)         0.737      0.747      0.728
##  8 Chicago, IL (2016)     Detroit, MI (2016)         0.740      0.746      0.735
##  9 Chicago, IL (2016)     Madison, WI (2016)         0.745      0.756      0.733
## 10 Grand Rapids, MI (201~ Milwaukee, WI (2016)       0.746      0.747      0.744
## 11 Green Bay, WI (2016)   Milwaukee, WI (2016)       0.754      0.757      0.751
## 12 Madison, WI (2016)     Minneapolis, MN (201~      0.758      0.759      0.757
## 13 Grand Rapids, MI (201~ Madison, WI (2016)         0.759      0.763      0.755
## 14 Detroit, MI (2016)     Milwaukee, WI (2016)       0.765      0.777      0.753
## 15 Grand Rapids, MI (201~ Traverse City, MI (2~      0.769      0.770      0.768
## 16 Detroit, MI (2016)     Indianapolis, IN (20~      0.772      0.775      0.770
## 17 Chicago, IL (2016)     Indianapolis, IN (20~      0.775      0.777      0.774
## 18 Boston, MA (2016)      Newark, NJ (2016)          0.778      0.781      0.776
## 19 Indianapolis, IN (201~ Saint Louis, MO (201~      0.779      0.791      0.768
## 20 Madison, WI (2016)     Traverse City, MI (2~      0.783      0.782      0.784

The best accuracies are obtained when comparing cities in very different climates (e.g., Denver vs. Humid/Marine or Miami vs. Desert/Cold), while the worst accuracies are obtained when comparing very similar cities (e.g., Chicago and Milwaukee or Newark and Philadelphia).

Variable importance can then be assessed across all 1:1 classifications:

# Create a tibble of all the variable importance data
val_varimp_2016 <- map_dfr(list_varimp_2016, 
                           .f=function(x) { x$rfModel %>% 
                                   caret::varImp() %>% 
                                   t() %>% 
                                   as.data.frame()
                               }
                           ) %>% 
    tibble::as_tibble()
# Create boxplot of overall variable importance
val_varimp_2016 %>% 
    mutate(num=1:nrow(val_varimp_2016)) %>% 
    pivot_longer(-num, names_to="variable", values_to="varImp") %>% 
    ggplot(aes(x=fct_reorder(variable, varImp), y=varImp)) + 
    geom_boxplot(fill="lightblue") + 
    labs(x="", 
         y="Variable Importance", 
         title="Variable Importance for 1:1 Random Forest Classifications"
         )

# Attach the city names and OOB error rate
tbl_varimp_2016 <- sapply(list_varimp_2016, 
                          FUN=function(x) { c(names(x$errorRate[2:3]), x$errorRate["OOB"]) }
                          ) %>%
    t() %>% 
    as.data.frame() %>% 
    bind_cols(val_varimp_2016) %>% 
    tibble::as_tibble() %>% 
    mutate(OOB=as.numeric(as.character(OOB))) %>%
    rename(locale1=V1, 
           locale2=V2
           )

# Plot accuracy vs. spikiness of variable importance
tbl_varimp_2016 %>%
    pivot_longer(-c(locale1, locale2, OOB), names_to="var", values_to="varImp") %>% 
    group_by(locale1, locale2, OOB) %>% 
    summarize(mean=mean(varImp), max=max(varImp)) %>% 
    mutate(maxMean=max/mean) %>%
    ggplot(aes(x=maxMean, y=1-OOB)) + 
    geom_point() + 
    geom_smooth(method="loess") +
    labs(x="Ratio of Maximum Variable Importance to Mean Variable Importance", 
         y="OOB Accuracy", 
         title="Accuracy vs. Spikiness of Variable Importance"
         )

Broadly speaking, the same variables that drive overall classification are important in driving 1:1 classifications. There is meaningful spikiness, suggesting that different 1:1 classifications rely on different variables.

There is a strong trend in which the best accuracies are obtained when a single spiky dimension drives the classifications. This suggests that while the model can take advantage of all 10 variables, it has the easiest time when there is a single, well-differentiated variable. No surprise.

Random forest regression of temperatures in select locales for 2014-2019

Random forests can also be used to run regressions, such as on variables like temperature or dew point. Models are run for the 2014-2019 data for the locales that have data availability (Chicago, IL; Las Vegas, NV; New Orleans, LA; San Diego, CA):

# Create list of locations
fullDataLocs <- c("Chicago, IL", "Las Vegas, NV", "New Orleans, LA", "San Diego, CA")

# Create a main list, one per locale
lstFullData <- vector("list", length(fullDataLocs))


# Create a list of relevant dependent variables and variables to keep
depVarFull <- c('hrfct', 'DewF', 'modSLP', 'Altimeter', 'WindSpeed', 
                'predomDir', 'minHeight', 'ceilingHeight'
                )
keepVarFull <- c('source', 'dtime', 'locNamefct', 'year', 'month', 'hrfct', 'DewF', 'modSLP', 
                 'Altimeter', 'WindSpeed', 'predomDir', 'minHeight', 'ceilingHeight'
                 )


# Run the regressions by locale and month
nLoc <- 1
for (loc in fullDataLocs) {
    
    # Pull data for only this locale, and where TempF is not missing
    pullData <- metarData %>%
        filter(locNamefct==loc, !is.na(TempF))
    
    # Create the months to be run
    fullDataMonths <- pullData %>%
        count(month) %>%
        pull(month)
    
    # Create containers for each run
    lstFullData[[nLoc]] <- vector("list", length(fullDataMonths))
    
    # Run random forest regression for each month for the locale
    cat("\nBeginning to process:", loc)
    nMonth <- 1
    for (mon in fullDataMonths) {
        
        # Run the regression
        lstFullData[[nLoc]][[nMonth]] <- rfRegression(pullData, 
                                                      depVar="TempF", 
                                                      predVars=depVarFull, 
                                                      otherVar=keepVarFull,
                                                      critFilter=list(locNamefct=loc, month=mon), 
                                                      seed=2007271252, 
                                                      ntree=100, 
                                                      mtry=4, 
                                                      testSize=0.3
                                                      )
        
        # Increment the counter
        nMonth <- nMonth + 1
        cat("\nFinished month:", mon)
    }
    
    # Increment the counter
    nLoc <- nLoc + 1
    
}
## 
## Beginning to process: Chicago, IL
## Finished month: Jan
## Finished month: Feb
## Finished month: Mar
## Finished month: Apr
## Finished month: May
## Finished month: Jun
## Finished month: Jul
## Finished month: Aug
## Finished month: Sep
## Finished month: Oct
## Finished month: Nov
## Finished month: Dec
## Beginning to process: Las Vegas, NV
## Finished month: Jan
## Finished month: Feb
## Finished month: Mar
## Finished month: Apr
## Finished month: May
## Finished month: Jun
## Finished month: Jul
## Finished month: Aug
## Finished month: Sep
## Finished month: Oct
## Finished month: Nov
## Finished month: Dec
## Beginning to process: New Orleans, LA
## Finished month: Jan
## Finished month: Feb
## Finished month: Mar
## Finished month: Apr
## Finished month: May
## Finished month: Jun
## Finished month: Jul
## Finished month: Aug
## Finished month: Sep
## Finished month: Oct
## Finished month: Nov
## Finished month: Dec
## Beginning to process: San Diego, CA
## Finished month: Jan
## Finished month: Feb
## Finished month: Mar
## Finished month: Apr
## Finished month: May
## Finished month: Jun
## Finished month: Jul
## Finished month: Aug
## Finished month: Sep
## Finished month: Oct
## Finished month: Nov
## Finished month: Dec

The relevant ‘testData’ files can then be combined for an assessment of overall prediction accuracy:

# Helper function to extract testData from inner list
combineTestData <- function(lst, elem="testData") {
    map_dfr(lst, .f=function(x) x[[elem]])
}

# Combine all of the test data files
fullTestData <- map_dfr(lstFullData, .f=combineTestData) %>%
    mutate(err=predicted-TempF, 
           year=factor(year)
           )

# Helper function to create RMSE data
helperCreateRMSE <- function(df, byVar, depVar, errVar="err") {
    
    df %>%
        group_by_at(vars(all_of(byVar))) %>%
        summarize(varTot=var(get(depVar)), varModel=mean(get(errVar)**2)) %>%
        mutate(rmseTot=varTot**0.5, rmseModel=varModel**0.5, rsq=1-varModel/varTot)
    
}

# Create plot for a given by-variable and facet-variable
helperRMSEPlot <- function(df, byVar, depVar, facetVar=NULL) {

    # Create a copy of the original by variable
    byVarOrig <- byVar
    
    # Expand byVar to include facetVar if facetVar is not null
    if (!is.null(facetVar)) {
        byVar <- unique(c(byVar, facetVar))
    }
    
    # Create the RMSE decomposition plot
    p1 <- df %>%
        helperCreateRMSE(byVar=byVar, depVar=depVar) %>%
        select_at(vars(all_of(c(byVar, "rmseTot", "rmseModel")))) %>%
        pivot_longer(c(rmseTot, rmseModel), names_to="model", values_to="rmse") %>%
        group_by_at(vars(all_of(byVar))) %>%
        mutate(dRMSE=ifelse(row_number()==n(), rmse, rmse-lead(rmse)), 
               model=factor(model, levels=c("rmseTot", "rmseModel"))
               ) %>%
        ggplot(aes_string(x=byVarOrig, y="dRMSE", fill="model")) + 
        geom_col() + 
        geom_text(data=~filter(., model=="rmseModel"), aes(y=dRMSE/2, label=round(dRMSE, 1))) +
        coord_flip() + 
        labs(x="", y="RMSE", title="RMSE before and after modelling") + 
        scale_fill_discrete("", 
                            breaks=c("rmseModel", "rmseTot"), 
                            labels=c("Final", "Explained by Model")
                            ) + 
        theme(legend.position="bottom")
    # Add facetting if the argument was passed
    if (!is.null(facetVar)) { p1 <- p1 + facet_wrap(as.formula(paste("~", facetVar))) }
    print(p1)
    
}

# Stand-alone on three main dimensions
helperRMSEPlot(fullTestData, byVar="locNamefct", depVar="TempF")

helperRMSEPlot(fullTestData, byVar="year", depVar="TempF")

helperRMSEPlot(fullTestData, byVar="month", depVar="TempF")

# Facetted by locale
helperRMSEPlot(fullTestData, byVar="year", depVar="TempF", facetVar="locNamefct")

helperRMSEPlot(fullTestData, byVar="month", depVar="TempF", facetVar="locNamefct")

Further, an overall decline in MSE can be estimated as the average of the MSE declines in each locale-month:

# Function to extract MSE data from inner lists
helperMSETibble <- function(x) { 
    map_dfr(x, .f=function(y) tibble::tibble(ntree=1:length(y$mse), mse=y$mse)) 
}

map_dfr(lstFullData, .f=function(x) { helperMSETibble(x) }, .id="source") %>%
    group_by(source, ntree) %>%
    summarize(meanmse=mean(mse)) %>%
    ungroup() %>%
    mutate(source=fullDataLocs[as.integer(source)]) %>%
    ggplot(aes(x=ntree, y=meanmse, group=source, color=source)) + 
    geom_line() + 
    ylim(c(0, NA)) + 
    labs(x="# Trees", y="MSE", title="Evolution of Average MSE by Number of Trees")

At 100 trees, the model appears to have largely completed learning, with no further material declines in MSE. Overall, model predictions differ from actual temperatures by 3-4 degrees on average. Deviations are larger in Las Vegas (4-5 degrees) and in spring in Chicago (4-5 degrees), and smaller in San Diego (2-3 degrees) and in winter in Chicago (2-3 degrees).
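The per-locale deviations cited above can be tabulated directly (a sketch, assuming `fullTestData` with columns `locNamefct` and `err` as built earlier in this section):

```r
# Sketch: summarize prediction error by locale from the combined test data
fullTestData %>%
    group_by(locNamefct) %>%
    summarize(mae=mean(abs(err)), rmse=mean(err**2)**0.5) %>%
    arrange(desc(mae))
```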

The model is then run for all months combined for a single locale, to compare results when month is a trained explanatory variable rather than a segment modelled separately:

# Create list of locations
fullDataLocs <- c("Chicago, IL", "Las Vegas, NV", "New Orleans, LA", "San Diego, CA")

# Create a main list, one per locale
lstFullData_002 <- vector("list", length(fullDataLocs))


# Create a list of predictor variables and the variables to keep
depVarFull_002 <- c('month', 'hrfct', 'DewF', 'modSLP', 
                    'Altimeter', 'WindSpeed', 'predomDir', 
                    'minHeight', 'ceilingHeight'
                    )
keepVarFull_002 <- c('source', 'dtime', 'locNamefct', 'year', 'month', 'hrfct', 
                     'DewF', 'modSLP', 'Altimeter', 'WindSpeed', 'predomDir', 
                     'minHeight', 'ceilingHeight'
                     )


# Run the regressions by locale (all months combined)
nLoc <- 1
for (loc in fullDataLocs) {
    
    # Pull data for only this locale, and where TempF is not missing
    pullData <- metarData %>%
        filter(locNamefct==loc, !is.na(TempF))
    
    # To be parallel with previous runs, make a length-one list inside locale
    lstFullData_002[[nLoc]] <- vector("list", 1)
    
    # Run random forest regression for each locale
    cat("\nBeginning to process:", loc)
    lstFullData_002[[nLoc]][[1]] <- rfRegression(pullData, 
                                                 depVar="TempF", 
                                                 predVars=depVarFull_002, 
                                                 otherVar=keepVarFull_002,
                                                 critFilter=list(locNamefct=loc), 
                                                 seed=2007281307, 
                                                 ntree=25, 
                                                 mtry=4, 
                                                 testSize=0.3
                                                 )
    
    # Increment the counter
    nLoc <- nLoc + 1
    
}
## 
## Beginning to process: Chicago, IL
## Beginning to process: Las Vegas, NV
## Beginning to process: New Orleans, LA
## Beginning to process: San Diego, CA

The results can then be compared to the results of the regressions run using month as a segment:

# Combine all of the test data files
fullTestData_002 <- map_dfr(lstFullData_002, .f=combineTestData) %>%
    mutate(err=predicted-TempF, 
           year=factor(year)
           )

# Stand-alone on three main dimensions
helperRMSEPlot(fullTestData_002, byVar="locNamefct", depVar="TempF")

helperRMSEPlot(fullTestData_002, byVar="year", depVar="TempF")

helperRMSEPlot(fullTestData_002, byVar="month", depVar="TempF")

# Facetted by locale
helperRMSEPlot(fullTestData_002, byVar="year", depVar="TempF", facetVar="locNamefct")

helperRMSEPlot(fullTestData_002, byVar="month", depVar="TempF", facetVar="locNamefct")

# Evolution of MSE
map_dfr(lstFullData_002, .f=function(x) { helperMSETibble(x) }, .id="source") %>%
    group_by(source, ntree) %>%
    summarize(meanmse=mean(mse)) %>%
    ungroup() %>%
    mutate(source=fullDataLocs[as.integer(source)]) %>%
    ggplot(aes(x=ntree, y=meanmse, group=source, color=source)) + 
    geom_line() + 
    ylim(c(0, NA)) + 
    labs(x="# Trees", y="MSE", title="Evolution of Average MSE by Number of Trees")

The prediction qualities and evolution of MSE by number of trees look broadly similar to the results run by locale-month. Notably, month scores high on variable importance:

impList <- lapply(lstFullData_002, FUN=function(x) { 
    locName <- x[[1]]$testData$locNamefct %>% as.character() %>% unique()
    x[[1]]$rfModel$importance %>% 
        as.data.frame() %>%
        rownames_to_column("variable") %>%
        rename_at(vars(all_of("IncNodePurity")), ~locName) %>%
        tibble::as_tibble()
    }
    )

impDF <- Reduce(function(x, y) merge(x, y, all=TRUE), impList)

# Overall variable importance
impDF %>%
    pivot_longer(-variable, names_to="locale", values_to="incPurity") %>%
    ggplot(aes(x=fct_reorder(varMapper[variable], incPurity), y=incPurity)) + 
    geom_col() + 
    coord_flip() + 
    facet_wrap(~locale) + 
    labs(x="", y="Importance", title="Variable Importance by Locale")

# Relative variable importance
impDF %>%
    pivot_longer(-variable, names_to="locale", values_to="incPurity") %>%
    group_by(locale) %>%
    mutate(incPurity=incPurity/sum(incPurity)) %>%
    ggplot(aes(x=fct_reorder(varMapper[variable], incPurity), y=incPurity)) + 
    geom_col() + 
    coord_flip() + 
    facet_wrap(~locale) + 
    labs(x="", y="Relative Importance", title="Relative Variable Importance by Locale")

There is much more underlying variance in the Chicago data, thus greater overall variable importance in Chicago. On a relative basis, locale predictions are driven by:

  • Chicago - Dew Point, Month
  • Las Vegas - Month, Sea-Level Pressure, Hour, Altimeter
  • New Orleans - Dew Point, Month
  • San Diego - Month, Hour, Dew Point

It is interesting to see the similarities in Chicago and New Orleans, with both having strong explanatory power from the combination of dew point and month, despite meaningfully different climates. As in previous analyses, Las Vegas and San Diego look different from each other and also different from Chicago/New Orleans.

XGB regression of temperatures in select locales for 2014-2019

Next, the xgboost (extreme gradient boosting) algorithm is attempted on the METAR dataset. The general recipe from CRAN is followed, which includes several processing steps:

  1. Convert factor variable(s) to binary variables using one-hot-encoding without intercept
  2. Capture the target output variable as a vector
  3. Train the model using xgboost::xgboost (can handle regression or classification)
  4. Check feature importances
  5. Plot the evolution in training RMSE and R-squared
  6. Assess accuracy on test dataset
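The six steps above can be compressed into a minimal sketch using the built-in `mtcars` data (an illustration only; the METAR runs below follow the same pattern on the real data):

```r
library(xgboost)

# Toy data: treat cyl as a factor so one-hot encoding has something to do
df <- mtcars
df$cyl <- factor(df$cyl)

# Step 1: one-hot encode factors into a sparse matrix, no intercept
X <- Matrix::sparse.model.matrix(mpg ~ . - 1, data=df)

# Step 2: capture the target variable as a vector
y <- df$mpg

# Step 3: train a small regression model
mdl <- xgboost(data=X, label=y, nrounds=25, objective="reg:squarederror", verbose=0)

# Step 4: feature importances
xgb.importance(feature_names=colnames(X), model=mdl)

# Steps 5-6: training RMSE evolution and (in-sample) prediction RMSE
mdl$evaluation_log
sqrt(mean((predict(mdl, newdata=X) - y)^2))
```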

A very basic xgb model is then attempted for predicting temperature. First, data are prepared:

# Take metarData and limit to 4 sources with 2014-2019 data
baseXGBData_big4 <- metarData %>%
    filter(locNamefct %in% c("Las Vegas, NV", "New Orleans, LA", "Chicago, IL", "San Diego, CA"), 
           !is.na(TempF)
           )

# Split into test and train datasets
idxTrain_big4 <- sample(1:nrow(baseXGBData_big4), size=round(0.7*nrow(baseXGBData_big4)), replace=FALSE)
baseXGBTrain_big4 <- baseXGBData_big4[idxTrain_big4, ]
baseXGBTest_big4 <- baseXGBData_big4[-idxTrain_big4, ]

# Select only variables of interest
xgbTrainInput_big4 <- baseXGBTrain_big4 %>%
    select(TempF, 
           locNamefct, month, hrfct, 
           DewF, modSLP, Altimeter, WindSpeed, 
           predomDir, minHeight, ceilingHeight
           ) %>%
    mutate(locNamefct=fct_drop(locNamefct))

Then, the three modeling steps are run:

# Step 1: Convert to sparse matrix format using one-hot encoding with no intercept
xgbTrainSparse_big4 <- Matrix::sparse.model.matrix(TempF ~ . - 1, data=xgbTrainInput_big4)
str(xgbTrainSparse_big4)
## Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
##   ..@ i       : int [1:1404725] 1 3 12 15 18 21 22 26 37 47 ...
##   ..@ p       : int [1:61] 0 36584 73243 109800 146521 157810 170204 182182 194663 206744 ...
##   ..@ Dim     : int [1:2] 146521 60
##   ..@ Dimnames:List of 2
##   .. ..$ : chr [1:146521] "1" "2" "3" "4" ...
##   .. ..$ : chr [1:60] "locNamefctChicago, IL" "locNamefctLas Vegas, NV" "locNamefctNew Orleans, LA" "locNamefctSan Diego, CA" ...
##   ..@ x       : num [1:1404725] 1 1 1 1 1 1 1 1 1 1 ...
##   ..@ factors : list()
xgbTrainSparse_big4[1:6, ]
## 6 x 60 sparse Matrix of class "dgCMatrix"
##    [[ suppressing 60 column names 'locNamefctChicago, IL', 'locNamefctLas Vegas, NV', 'locNamefctNew Orleans, LA' ... ]]
##                                                                              
## 1 . 1 . . . . 1 . . . . . . . . . . . . . . . 1 . . . . . . . . . . . . . . .
## 2 1 . . . . . . 1 . . . . . . . . . . . 1 . . . . . . . . . . . . . . . . . .
## 3 . . . 1 . . . 1 . . . . . . . 1 . . . . . . . . . . . . . . . . . . . . . .
## 4 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
## 5 . 1 . . . . 1 . . . . . . . . 1 . . . . . . . . . . . . . . . . . . . . . .
## 6 . . 1 . . . . . . . . . . 1 . . . . 1 . . . . . . . . . . . . . . . . . . .
##                                                            
## 1 17.06 1007.2 29.80  8 . . . . . . 1 . . . . . . 1 . . . 1
## 2 48.92 1011.7 29.89 16 . . . . . . 1 . . . . . . 1 . . . 1
## 3 55.94 1012.6 29.90  7 . . . . . . . . 1 . 1 . . . . . . 1
## 4 26.96 1016.9 30.01  8 . . . . . . . . 1 . 1 . . . . . 1 .
## 5  5.00 1016.0 30.05  6 . . . . . 1 . . . . . . . 1 . . . 1
## 6 46.04 1019.4 30.10  9 . . . . 1 . . . . . . . . 1 . . . 1
# Step 2: Create the target output variable as a vector
xgbTrainOutput_big4 <- xgbTrainInput_big4$TempF
str(xgbTrainOutput_big4)
##  num [1:146521] 66.9 75 66.9 33.1 73.9 ...
# Step 3: Train the model using xgboost::xgboost, as regression
xgbModel_big4 <- xgboost::xgboost(data=xgbTrainSparse_big4, 
                                  label=xgbTrainOutput_big4, 
                                  nrounds=200, 
                                  print_every_n=20, 
                                  objective="reg:squarederror"
                                  )
## [1]  train-rmse:47.123901 
## [21] train-rmse:5.213187 
## [41] train-rmse:4.517477 
## [61] train-rmse:4.131727 
## [81] train-rmse:3.929088 
## [101]    train-rmse:3.758572 
## [121]    train-rmse:3.622754 
## [141]    train-rmse:3.511173 
## [161]    train-rmse:3.426437 
## [181]    train-rmse:3.357111 
## [200]    train-rmse:3.290211

Then, the three assessment steps are run:

# Step 4: Assess feature importances
xgbImportance_big4 <- xgboost::xgb.importance(feature_names=colnames(xgbTrainSparse_big4), 
                                              model=xgbModel_big4
                                              )

xgbImportance_big4 %>% 
    column_to_rownames("Feature") %>% 
    round(3)
##                            Gain Cover Frequency
## DewF                      0.417 0.068     0.172
## locNamefctChicago, IL     0.203 0.011     0.025
## Altimeter                 0.084 0.107     0.107
## modSLP                    0.072 0.120     0.146
## ceilingHeightNone         0.037 0.010     0.035
## locNamefctLas Vegas, NV   0.020 0.029     0.041
## monthJul                  0.018 0.011     0.009
## monthJun                  0.015 0.011     0.012
## monthAug                  0.015 0.009     0.008
## monthSep                  0.010 0.012     0.009
## monthDec                  0.008 0.023     0.011
## WindSpeed                 0.007 0.018     0.086
## monthFeb                  0.007 0.025     0.009
## minHeightMedium           0.006 0.006     0.021
## locNamefctSan Diego, CA   0.005 0.008     0.021
## monthMay                  0.005 0.020     0.013
## minHeightNone             0.005 0.015     0.022
## monthOct                  0.005 0.017     0.011
## hrfct21                   0.004 0.014     0.004
## hrfct20                   0.004 0.012     0.005
## minHeightHigh             0.003 0.010     0.018
## locNamefctNew Orleans, LA 0.003 0.007     0.014
## hrfct13                   0.003 0.019     0.003
## hrfct12                   0.003 0.020     0.004
## hrfct11                   0.003 0.020     0.004
## monthApr                  0.003 0.024     0.012
## hrfct22                   0.003 0.013     0.003
## hrfct19                   0.003 0.010     0.004
## hrfct10                   0.003 0.021     0.003
## hrfct23                   0.003 0.009     0.003
## hrfct9                    0.002 0.020     0.003
## minHeightLow              0.002 0.004     0.014
## hrfct18                   0.002 0.015     0.004
## hrfct14                   0.002 0.020     0.004
## monthMar                  0.002 0.022     0.008
## ceilingHeightHigh         0.002 0.004     0.008
## monthNov                  0.001 0.018     0.009
## hrfct8                    0.001 0.022     0.004
## hrfct15                   0.001 0.017     0.005
## predomDirVRB              0.001 0.007     0.003
## hrfct7                    0.001 0.020     0.003
## hrfct17                   0.001 0.012     0.005
## hrfct1                    0.001 0.009     0.004
## hrfct16                   0.001 0.013     0.006
## hrfct6                    0.001 0.020     0.003
## predomDirS                0.001 0.001     0.009
## hrfct2                    0.000 0.009     0.004
## predomDirNE               0.000 0.003     0.008
## ceilingHeightMedium       0.000 0.008     0.005
## hrfct5                    0.000 0.019     0.003
## predomDirSW               0.000 0.001     0.008
## ceilingHeightLow          0.000 0.003     0.006
## predomDirN                0.000 0.002     0.009
## hrfct3                    0.000 0.009     0.005
## predomDirNW               0.000 0.006     0.007
## predomDirW                0.000 0.002     0.008
## predomDirE                0.000 0.002     0.007
## hrfct4                    0.000 0.016     0.003
## predomDirSE               0.000 0.000     0.005
xgbImportance_big4 %>%
    ggplot(aes(x=fct_reorder(Feature, Gain), y=Gain)) + 
    geom_col(fill="lightblue") + 
    geom_text(aes(y=Gain+0.02, label=round(Gain, 3))) + 
    coord_flip() + 
    labs(x="", title="Gain by Variable for TempF modeling with xgboost")

# Step 5: Plot evolution in training data RMSE and R-squared
xgbModel_big4$evaluation_log %>%
    filter(iter %% 5 == 0) %>%
    ggplot(aes(x=iter, y=train_rmse)) + 
    geom_text(aes(label=round(train_rmse, 1)), size=3) + 
    labs(x="Number of iterations", y="Training Set RMSE", title="Evolution of RMSE on training data")

xgbModel_big4$evaluation_log %>%
    filter(iter %% 10 == 0) %>%
    mutate(overall_rmse=sd(baseXGBTrain_big4$TempF), rsq=1-train_rmse**2/overall_rmse**2) %>%
    ggplot(aes(x=iter, y=rsq)) + 
    geom_text(aes(y=rsq, label=round(rsq,3)), size=3) +
    labs(x="Number of iterations", y="Training Set R-squared", title="Evolution of R-squared on training data")

# Step 6: Assess accuracy on test dataset
xgbTestInput_big4 <- baseXGBTest_big4 %>%
    select(TempF, 
           locNamefct, month, hrfct, 
           DewF, modSLP, Altimeter, WindSpeed, 
           predomDir, minHeight, ceilingHeight
           ) %>%
    mutate(locNamefct=fct_drop(locNamefct))

xgbTestSparse_big4 <- Matrix::sparse.model.matrix(TempF ~ . - 1, data=xgbTestInput_big4)

xgbTest_big4 <- xgbTestInput_big4 %>%
    mutate(xgbPred=predict(xgbModel_big4, newdata=xgbTestSparse_big4), err=xgbPred-TempF)

xgbTest_big4 %>%
    group_by(locNamefct) %>%
    summarize(rmse_orig=sd(TempF), rmse_xgb=mean(err**2)**0.5) %>%
    mutate(rsq=1-rmse_xgb**2/rmse_orig**2)
## # A tibble: 4 x 4
##   locNamefct      rmse_orig rmse_xgb   rsq
##   <fct>               <dbl>    <dbl> <dbl>
## 1 Chicago, IL         21.2      3.91 0.966
## 2 Las Vegas, NV       18.3      3.96 0.953
## 3 New Orleans, LA     13.4      3.39 0.936
## 4 San Diego, CA        7.38     3.13 0.820
xgbTest_big4 %>%
    group_by(month) %>%
    summarize(rmse_orig=sd(TempF), rmse_xgb=mean(err**2)**0.5) %>%
    mutate(rsq=1-rmse_xgb**2/rmse_orig**2)
## # A tibble: 12 x 4
##    month rmse_orig rmse_xgb   rsq
##    <fct>     <dbl>    <dbl> <dbl>
##  1 Jan        17.4     3.74 0.954
##  2 Feb        17.6     3.65 0.957
##  3 Mar        14.9     3.76 0.936
##  4 Apr        12.5     4.06 0.894
##  5 May        11.5     3.72 0.895
##  6 Jun        12.3     3.42 0.923
##  7 Jul        11.2     3.22 0.918
##  8 Aug        10.1     3.09 0.907
##  9 Sep        10.1     3.68 0.866
## 10 Oct        11.7     3.76 0.897
## 11 Nov        14.1     3.81 0.927
## 12 Dec        14.2     3.36 0.944
xgbTest_big4 %>%
    group_by(TempF, rndPred=round(xgbPred)) %>%
    summarize(n=n()) %>%
    ggplot(aes(x=TempF, y=rndPred)) + 
    geom_point(aes(size=n), alpha=0.1) + 
    geom_smooth(aes(weight=n)) + 
    geom_abline(lty=2, color="red") + 
    labs(title="XGB predictions vs. actual on test dataset", y="Predicted Temperature", x="Actual Temperature")
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'

At a glance, initial prediction results are encouraging. The model runs very quickly and achieves an RMSE/R-squared on test data comparable to the random forest. Tuning parameters or adding cross-validation could potentially improve the algorithm further.
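As one such next step, k-fold cross-validation could guide the choice of `nrounds`; a sketch using `xgboost::xgb.cv` on the training matrix already built above (parameter values here are illustrative, not tuned):

```r
# Sketch: 5-fold CV with early stopping to pick a stopping point for nrounds
# (assumes xgbTrainSparse_big4 and xgbTrainOutput_big4 from the chunks above)
cvRes <- xgboost::xgb.cv(data=xgbTrainSparse_big4,
                         label=xgbTrainOutput_big4,
                         nrounds=500,
                         nfold=5,
                         objective="reg:squarederror",
                         early_stopping_rounds=20,
                         verbose=0
                         )

# Best iteration and its held-out RMSE
cvRes$best_iteration
cvRes$evaluation_log[cvRes$best_iteration, ]
```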

An initial conversion to functional form is made, leveraging some of the code already available in rfMultiLocale() and rfTwoLocales():

# Helper function to create sparse matrix without intercept, keeping all factor levels
# (uses contrasts.arg so every level of every factor gets a column, akin to caret::dummyVars)
helperMakeSparse <- function(tbl, depVar, predVars) {
    
    # FUNCTION ARGUMENTS
    # tbl: the tibble or data frame to be converted
    # depVar: the dependent variable (not to be included in the sparse matrix)
    # predVars: the predictor variables to be converted to sparse format
    
    # Filter to include only predVars then make sparse model-matrix object
    # Include all contrast levels for every factor variable and exclude the intercept
    tbl %>%
        select_at(vars(all_of(c(predVars)))) %>%
        Matrix::sparse.model.matrix(~ . -1, 
                                    data=., 
                                    contrasts.arg=lapply(.[, sapply(., is.factor)], contrasts, contrasts=FALSE)
                                    )
    
}


# Run xgb model with desired parameters
xgbRunModel <- function(tbl, 
                        depVar, 
                        predVars,
                        otherVars=c("source", "dtime"),
                        critFilter=vector("list", 0),
                        dropEmptyLevels=TRUE,
                        seed=NULL, 
                        nrounds=200, 
                        print_every_n=nrounds, 
                        testSize=0.3, 
                        xgbObjective="reg:squarederror",
                        funcRun=xgboost::xgboost,
                        calcErr=TRUE,
                        ...
                        ) {
    
    # FUNCTION ARGUMENTS:
    # tbl: the data frame or tibble
    # depVar: the dependent variable that will be predicted
    # predVars: explanatory variables for modeling
    # otherVars: other variables to be kept in a final testData file, but not used in modeling
    # critFilter: named list of format list(varName=c(varValues))
    #             will include only observations where get(varName) %in% varValues
    #             vector("list", 0) creates a length-zero list, which never runs in the for loop
    # dropEmptyLevels: boolean, whether to run fct_drop on all variables of class factor after critFilter
    # seed: the random seed (NULL means no seed)
    # nrounds: the maximum number of boosting iterations
    # print_every_n: how frequently to print the progress of training error/accuracy while fitting
    # testSize: the fractional portion of data that should be used as the test dataset
    # xgbObjective: the objective function for xgboost
    # funcRun: the function to run, passed as a function
    # calcErr: boolean, whether to create variable err as predicted-get(depVar)
    # ...: additional arguments to be passed directly to xgboost
    
    # Check that funcName is valid and get the relevant function
    valFuncs <- c("xgboost", "xgb.cv")
    funcName <- as.character(substitute(funcRun))
    if (!(funcName[length(funcName)] %in% valFuncs)) {
        cat("\nFunction is currently only prepared for:", valFuncs, "\n")
        stop("Please change passed argument or update function\n")
    }
    
    # Filter such that only matches to critFilter are included
    for (xNum in seq_len(length(critFilter))) {
        tbl <- tbl %>%
            filter_at(vars(all_of(names(critFilter)[xNum])), ~. %in% critFilter[[xNum]])
    }
    
    # Keep only the depVar, predVar, and otherVars
    tbl <- tbl %>%
        select_at(vars(all_of(c(depVar, predVars, otherVars))))
    
    # Drop empty levels from factors if requested
    if (dropEmptyLevels) {
        tbl <- tbl %>%
            mutate_if(is.factor, .funs=fct_drop)
    }
    
    # Create test-train split
    ttLists <- createTestTrain(tbl, testSize=testSize, seed=seed)
    
    # Set the seed if requested
    if (!is.null(seed)) { set.seed(seed) }
    
    # Pull the dependent variable
    yTrain <- ttLists$trainData[, depVar, drop=TRUE]
    
    # Convert the dependent variable to be integers 0 to (n-1) if xgbObjective is "multi:.*"
    if (str_detect(xgbObjective, pattern="^multi:")) {
        # Convert to factor if passed as anything else
        if (!is.factor(yTrain)) yTrain <- factor(yTrain)
        # Save the factor levels so they can be added back later
        yTrainLevels <- levels(yTrain)
        # Convert to numeric 0 to n-1
        yTrain <- as.integer(yTrain) - 1
    } else {
        yTrainLevels <- NULL
    }
    
    # Convert predictor variables to sparse matrix format keeping only modeling variables
    sparseTrain <- helperMakeSparse(ttLists$trainData, depVar=depVar, predVars=predVars)
    sparseTest <- helperMakeSparse(ttLists$testData, depVar=depVar, predVars=predVars)
    
    # Train model
    xgbModel <- funcRun(data=sparseTrain, 
                        label=yTrain, 
                        nrounds=nrounds, 
                        print_every_n=print_every_n, 
                        objective=xgbObjective, 
                        ...
                        )

    # Initialize testData and predData as NULL (they remain NULL for xgb.cv runs)
    testData <- NULL
    predData <- NULL
    
    # Extract testData and add predictions if the model passed is xgboost
    if (funcName[length(funcName)] %in% c("xgboost")) {
        if (xgbObjective=="multi:softprob") {
            predData <- matrix(data=predict(xgbModel, newdata=sparseTest), 
                               nrow=nrow(ttLists$testData), 
                               ncol=length(yTrainLevels), 
                               byrow=TRUE
                               )
            maxCol <- apply(predData, 1, FUN=which.max)
            testData <- ttLists$testData %>%
                mutate(predicted=yTrainLevels[maxCol], probPredicted=apply(predData, 1, FUN=max))
            predData <- predData %>%
                as_tibble() %>%
                purrr::set_names(yTrainLevels)
        } else {
            testData <- ttLists$testData %>%
                mutate(predicted=predict(xgbModel, newdata=sparseTest))
            if (calcErr) { 
                testData <- testData %>% mutate(err=predicted-get(depVar))
            }
        }
    }
    
    # Return list containing funcName, trained model, and testData
    list(funcName=funcName[length(funcName)], 
         xgbModel=xgbModel, 
         testData=testData, 
         predData=predData,
         yTrainLevels=yTrainLevels
         )
    
}
# Define key predictor variables for base XGB runs
baseXGBPreds <- c("locNamefct", "month", "hrfct", 
                  "DewF", "modSLP", "Altimeter", "WindSpeed", 
                  "predomDir", "minHeight", "ceilingHeight"
                  )

# Core multi-year cities
multiYearLocales <- c("Las Vegas, NV", "New Orleans, LA", "Chicago, IL", "San Diego, CA")

# Run the function shell
xgbInit <- xgbRunModel(filter(metarData, !is.na(TempF)), 
                       depVar="TempF", 
                       predVars=baseXGBPreds, 
                       otherVars=c("source", "dtime"), 
                       critFilter=list(locNamefct=multiYearLocales),
                       seed=2008011825,
                       nrounds=2000,
                       print_every_n=50
                       )
## [1]  train-rmse:47.096767 
## [51] train-rmse:4.079394 
## [101]    train-rmse:3.584855 
## [151]    train-rmse:3.299586 
## [201]    train-rmse:3.106099 
## [251]    train-rmse:2.970594 
## [301]    train-rmse:2.866863 
## [351]    train-rmse:2.776774 
## [401]    train-rmse:2.697801 
## [451]    train-rmse:2.626064 
## [501]    train-rmse:2.572620 
## [551]    train-rmse:2.518776 
## [601]    train-rmse:2.469624 
## [651]    train-rmse:2.426183 
## [701]    train-rmse:2.390440 
## [751]    train-rmse:2.350134 
## [801]    train-rmse:2.315014 
## [851]    train-rmse:2.276004 
## [901]    train-rmse:2.244308 
## [951]    train-rmse:2.213070 
## [1001]   train-rmse:2.184253 
## [1051]   train-rmse:2.156494 
## [1101]   train-rmse:2.133281 
## [1151]   train-rmse:2.104960 
## [1201]   train-rmse:2.079930 
## [1251]   train-rmse:2.053286 
## [1301]   train-rmse:2.030751 
## [1351]   train-rmse:2.010247 
## [1401]   train-rmse:1.989779 
## [1451]   train-rmse:1.966085 
## [1501]   train-rmse:1.944863 
## [1551]   train-rmse:1.920303 
## [1601]   train-rmse:1.897431 
## [1651]   train-rmse:1.877888 
## [1701]   train-rmse:1.860289 
## [1751]   train-rmse:1.836947 
## [1801]   train-rmse:1.817471 
## [1851]   train-rmse:1.800320 
## [1901]   train-rmse:1.783760 
## [1951]   train-rmse:1.767166 
## [2000]   train-rmse:1.754318

Functions can then be written to assess the quality of the predictions:

# Create and plot importance for XGB model
plotXGBImportance <- function(mdl, 
                              subList="xgbModel", 
                              featureStems=NULL, 
                              showMainPlot=TRUE, 
                              showStemPlot=!is.null(featureStems), 
                              stemMapper=NULL,
                              plotTitle="Gain by Variable for xgboost", 
                              plotSubtitle=NULL
                              ) {
    
    # FUNCTION ARGUMENTS:
    # mdl: the xgb.Booster model file, or a list containing the xgb.Booster model file
    # subList: if mdl is a list, attempt to pull out item named in subList
    # featureStems: aggregate features starting with this vector of stems, and plot sum of gain 
    #               (NULL means do not do this and just plot the gains "as is" )
    # showMainPlot: boolean, whether to create the full importance plot (just return importance data otherwise)
    # showStemPlot: boolean, whether to create the plot summed by stems
    # stemMapper: mapping file to convert stem variables to descriptive names (NULL means leave as-is)
    # plotTitle: title to be included on the importance plots
    # plotSubtitle: subtitle to be included on the importance plots (NULL means no subtitle)

    # Pull out the modeling data from the list if needed
    if (!("xgb.Booster" %in% class(mdl))) {
        mdl <- mdl[[subList]]
    }
    
    # Pull out the feature importances
    xgbImportance <- xgboost::xgb.importance(model=mdl)
    
    # Helper function to sum data to stem (called below if featureStems is not NULL)
    helperStemTotal <- function(pattern, baseData=xgbImportance) {
        baseData %>%
            filter(grepl(pattern=paste0("^", pattern), x=Feature)) %>%
            select_if(is.numeric) %>%
            colSums()
    }
    
    # Create sums by stem if requested
    if (!is.null(featureStems)) {
        stemTotals <- sapply(featureStems, FUN=function(x) { helperStemTotal(pattern=x) } ) %>%
            t() %>%
            as.data.frame() %>%
            rownames_to_column("Feature") %>%
            tibble::as_tibble()
    } else {
        stemTotals <- NULL
    }

    # Helper function to plot gain by Feature
    helperPlotGain <- function(df, title, subtitle, mapper=NULL, caption=NULL) {
        # Add descriptive name if mapper is passed
        if (!is.null(mapper)) df <- df %>% mutate(Feature=paste0(Feature, "\n", mapper[Feature]))
        p1 <- df %>%
            ggplot(aes(x=fct_reorder(Feature, Gain), y=Gain)) + 
            geom_col(fill="lightblue") + 
            geom_text(aes(y=Gain+0.02, label=round(Gain, 3))) + 
            coord_flip() + 
            labs(x="", title=title)
        if (!is.null(subtitle)) { p1 <- p1 + labs(subtitle=subtitle) }
        if (!is.null(caption)) { p1 <- p1 + labs(caption=caption) }
        print(p1)
    }
    
    # Create and display the plots if requested
    if (showMainPlot) helperPlotGain(xgbImportance, title=plotTitle, subtitle=plotSubtitle)
    if (showStemPlot) { 
        helperPlotGain(stemTotals, 
                       title=plotTitle, 
                       subtitle=plotSubtitle,
                       mapper=stemMapper,
                       caption="Factor variables summed by stem"
                       )
    }
    
    # Return a list containing the raw data and the stemmed data (can be NULL)
    list(importanceData=xgbImportance, stemData=stemTotals)
    
}

# Find and plot importances
xgbInit_importance <- plotXGBImportance(xgbInit, 
                                        featureStems=baseXGBPreds, 
                                        stemMapper = varMapper, 
                                        plotTitle="Gain by variable in xgboost", 
                                        plotSubtitle="Modeling Temperature (F) in 4 Locales 2014-2019"
                                        )

# Function to create and plot RMSE and R-squared evolution of training data
plotXGBTrainEvolution <- function(mdl, 
                                  fullSD,
                                  subList="xgbModel", 
                                  plotRMSE=TRUE,
                                  plotR2=TRUE, 
                                  plot_every=10
                                  ) {
    
    # FUNCTION ARGUMENTS:
    # mdl: the xgb.Booster model file, or a list containing the xgb.Booster model file
    # fullSD: the overall standard deviation for the training variable
    #         if passed as numeric, use as-is
    #         if passed as character, try to extract sd of that variable from 'testData' in mdl
    # subList: if mdl is a list, attempt to pull out item named in subList
    # plotRMSE: boolean, whether to create the RMSE evolution plot
    # plotR2: boolean, whether to create the R2 evolution plot
    # plot_every: how often to plot the RMSE/R-squared data (e.g., 10 means plot iterations 10, 20, 30, etc.)

    # Create the full standard deviation from 'testData' if passed as character
    # Must be run before mdl is converted out of list format
    if (is.character(fullSD)) {
        fullSD <- mdl[["testData"]] %>% pull(fullSD) %>% sd()
    }
    
    # Pull out the modeling data from the list if needed
    if (!("xgb.Booster" %in% class(mdl))) {
        mdl <- mdl[[subList]]
    }
    
    # Extract the evaluation log and add an approximated R-squared
    rmseR2 <- mdl[["evaluation_log"]] %>%
        mutate(overall_rmse=fullSD, rsq=1-train_rmse**2/overall_rmse**2) %>%
        tibble::as_tibble()
    
    # Helper function to create requested plot(s)
    helperPlotEvolution <- function(df, yVar, rnd, desc, size=3, plot_every=1) {
        p1 <- df %>%
            filter((iter %% plot_every) == 0) %>%
            ggplot(aes_string(x="iter", y=yVar)) + 
            geom_text(aes(label=round(get(yVar), rnd)), size=size) + 
            labs(x="Number of iterations", 
                 y=paste0("Training Set ", desc), 
                 title=paste0("Evolution of ", desc, " on training data")
                 )
        print(p1)
    }
    
    # Create the RMSE and R-squared plots if requested
    if (plotRMSE) helperPlotEvolution(rmseR2, yVar="train_rmse", rnd=1, desc="RMSE", plot_every=plot_every)
    if (plotR2) helperPlotEvolution(rmseR2, yVar="rsq", rnd=3, desc="R-squared", plot_every=plot_every)
    
    # Return the full evolution data frame
    rmseR2
    
}

# Plot the evolution
xgbInitEvolution <- plotXGBTrainEvolution(xgbInit, fullSD="TempF", plot_every=50)

# Function to report on, and plot, prediction caliber on testData
plotXGBTestData <- function(mdl, 
                            depVar,
                            predVar="predicted",
                            subList="testData", 
                            reportOverall=TRUE,
                            reportBy=NULL, 
                            showPlot=TRUE
                            ) {
    
    # FUNCTION ARGUMENTS:
    # mdl: the test data file, or a list containing the test data file
    # depVar: the variable that was predicted
    # predVar: the variable containing the prediction for depVar
    # subList: if mdl is a list, attempt to pull out item named in subList
    # reportOverall: boolean, whether to report an overall RMSE/R2 on test data
    # reportBy: variable(s) for summarizing RMSE/R2 by (NULL means no RMSE/R2 by any grouping variables)
    # showPlot: boolean, whether to create/show the plot of predictions vs actuals
    
    # Pull out the modeling data from the list if needed
    if ("list" %in% class(mdl)) {
        mdl <- mdl[[subList]]
    }
    
    # Helper function to print RMSE and R2
    helperReportRMSER2 <- function(df, depVar, predVar) {
        df %>%
            summarize(rmse_orig=sd(get(depVar)), 
                      rmse_xgb=mean((get(predVar)-get(depVar))**2)**0.5
                      ) %>%
            mutate(rsq=1-rmse_xgb**2/rmse_orig**2) %>%
            print()
        cat("\n")
    }
    
    # Report overall RMSE/R2 if requested
    if (reportOverall) {
        cat("\nOVERALL PREDICTIVE PERFORMANCE:\n\n")
        helperReportRMSER2(mdl, depVar=depVar, predVar=predVar)
        cat("\n")
    }
    
    # Report by grouping variables if any
    if (!is.null(reportBy)) {
        cat("\nPREDICTIVE PERFORMANCE BY GROUP(S):\n\n")
        sapply(reportBy, FUN=function(x) { 
            mdl %>% group_by_at(x) %>% helperReportRMSER2(depVar=depVar, predVar=predVar)
        }
        )
        cat("\n")
    }

    # Show overall model performance using rounded temperature and predictions
    if (showPlot) {
        p1 <- mdl %>%
            mutate(rndPred=round(get(predVar))) %>%
            group_by_at(vars(all_of(c(depVar, "rndPred")))) %>%
            summarize(n=n()) %>%
            ggplot(aes_string(x=depVar, y="rndPred")) + 
            geom_point(aes(size=n), alpha=0.1) + 
            geom_smooth(aes(weight=n)) + 
            geom_abline(lty=2, color="red") + 
            labs(title="XGB predictions vs. actual on test dataset", y="Predicted", x="Actual")
        print(p1)
    }
    
}

# Assess performance on test data
plotXGBTestData(xgbInit, depVar="TempF", reportBy=c("locNamefct", "month"))
## 
## OVERALL PREDICTIVE PERFORMANCE:
## 
## # A tibble: 1 x 3
##   rmse_orig rmse_xgb   rsq
##       <dbl>    <dbl> <dbl>
## 1      18.0     2.98 0.972
## 
## 
## 
## PREDICTIVE PERFORMANCE BY GROUP(S):
## 
## # A tibble: 4 x 4
##   locNamefct      rmse_orig rmse_xgb   rsq
##   <fct>               <dbl>    <dbl> <dbl>
## 1 Chicago, IL         21.1      3.24 0.976
## 2 Las Vegas, NV       18.3      2.75 0.977
## 3 New Orleans, LA     13.3      3.03 0.948
## 4 San Diego, CA        7.37     2.89 0.846
## 
## # A tibble: 12 x 4
##    month rmse_orig rmse_xgb   rsq
##    <fct>     <dbl>    <dbl> <dbl>
##  1 Jan        17.0     2.90 0.971
##  2 Feb        17.6     3.07 0.970
##  3 Mar        14.9     3.08 0.957
##  4 Apr        12.4     3.22 0.932
##  5 May        11.6     3.08 0.929
##  6 Jun        12.2     2.82 0.946
##  7 Jul        11.1     2.78 0.937
##  8 Aug        10.1     2.67 0.930
##  9 Sep        10.3     2.90 0.920
## 10 Oct        11.8     3.15 0.929
## 11 Nov        14.1     3.18 0.949
## 12 Dec        14.2     2.88 0.959
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'

The XGB model appears to be over-fitting the training data, as reflected by the RMSE gap between the test data set and the training data set:

# Get the random forest by locale RMSE reported by the training process
rfByLocaleRMSE <- sapply(lstFullData_002, FUN=function(x) { x[[1]][["rfModel"]][["mse"]] }) %>%
    as.data.frame() %>%
    tibble::as_tibble() %>%
    mutate(train_rmse=apply(., 1, FUN=mean)**0.5)

# Get the random forest RMSE reported by the test data
rfTestRMSE <- fullTestData_002 %>% 
    summarize(rfTestrmse=mean(err**2)**0.5) %>% 
    pull(rfTestrmse)

# Get the XGB RMSE reported by the training process
xgbInitRMSE <- xgbInit$xgbModel$evaluation_log

# Get the XGB RMSE reported by the test data
xgbTestRMSE <- xgbInit$testData %>%
    summarize(xgbTestrmse=mean(err**2)**0.5) %>%
    pull(xgbTestrmse)

# Plot differences in train/test RMSE
tibble::tibble(model=c("Random Forest", "XGB"), 
               train=c(tail(rfByLocaleRMSE$train_rmse, 1), tail(xgbInitRMSE$train_rmse, 1)), 
               test=c(rfTestRMSE, xgbTestRMSE)
               ) %>%
    pivot_longer(-model, names_to="type", values_to="RMSE") %>%
    mutate(type=factor(type, levels=c("train", "test"))) %>%
    ggplot(aes(x=model, y=RMSE, fill=type)) + 
    geom_col(position="dodge") + 
    geom_text(aes(y=RMSE+0.2, label=round(RMSE, 1)), position=position_dodge(width=1)) +
    labs(x="", title="XGB test RMSE is meaningfully higher than XGB train RMSE (overfitting)") + 
    scale_fill_discrete("") + 
    theme(legend.position="bottom")

So, the XGB model either needs to be tuned, or it needs to run for far fewer iterations. The function xgb.cv performs cross-validation during the training process, keeping a running record of both train and test RMSE by iteration. Suppose that this function is run:

# Create the same training dataset as would have been passed to the previous XGB modeling
xgbInput_big4 <- metarData %>%
    filter(!is.na(TempF), locNamefct %in% multiYearLocales) %>%
    anti_join(xgbInit$testData %>% select(source, dtime)) %>%
    select_at(vars(all_of(c("TempF", baseXGBPreds, "source", "dtime")))) %>%
    mutate_if(is.factor, .funs=fct_drop)
## Joining, by = c("source", "dtime")
# Step 1: Convert to sparse matrix format using one-hot encoding with no intercept
xgbTrain_big4 <- helperMakeSparse(xgbInput_big4, predVars=baseXGBPreds)

# Step 2: Create the target output variable as a vector
xgbTarget_big4 <- xgbInput_big4$TempF

# Step 3: Train the model using xgboost::xgb.cv, as regression
xgbInit_cv <- xgboost::xgb.cv(data=xgbTrain_big4, 
                              label=xgbTarget_big4, 
                              nrounds=2000, 
                              nfold=5,
                              print_every_n=50, 
                              objective="reg:squarederror"
                              )
## [1]  train-rmse:47.098820+0.017976   test-rmse:47.099904+0.076497 
## [51] train-rmse:4.051789+0.021049    test-rmse:4.223803+0.019306 
## [101]    train-rmse:3.539190+0.006330    test-rmse:3.804860+0.016793 
## [151]    train-rmse:3.247462+0.010342    test-rmse:3.581248+0.012934 
## [201]    train-rmse:3.058367+0.008329    test-rmse:3.451586+0.015910 
## [251]    train-rmse:2.916631+0.013438    test-rmse:3.363312+0.014480 
## [301]    train-rmse:2.805518+0.011073    test-rmse:3.299380+0.011600 
## [351]    train-rmse:2.706593+0.006588    test-rmse:3.247507+0.010449 
## [401]    train-rmse:2.625376+0.006445    test-rmse:3.211091+0.008976 
## [451]    train-rmse:2.550301+0.006425    test-rmse:3.181705+0.009290 
## [501]    train-rmse:2.486581+0.009604    test-rmse:3.158292+0.008441 
## [551]    train-rmse:2.429987+0.012186    test-rmse:3.138479+0.008697 
## [601]    train-rmse:2.380056+0.010852    test-rmse:3.124328+0.008938 
## [651]    train-rmse:2.333796+0.011301    test-rmse:3.111427+0.008789 
## [701]    train-rmse:2.290457+0.008911    test-rmse:3.100707+0.009145 
## [751]    train-rmse:2.251564+0.008006    test-rmse:3.093594+0.008037 
## [801]    train-rmse:2.212418+0.007689    test-rmse:3.084205+0.007674 
## [851]    train-rmse:2.176226+0.010150    test-rmse:3.078630+0.006701 
## [901]    train-rmse:2.141665+0.009509    test-rmse:3.073187+0.006579 
## [951]    train-rmse:2.108831+0.009814    test-rmse:3.068691+0.005981 
## [1001]   train-rmse:2.075038+0.008304    test-rmse:3.064253+0.006078 
## [1051]   train-rmse:2.043687+0.008306    test-rmse:3.060923+0.005548 
## [1101]   train-rmse:2.016390+0.008564    test-rmse:3.058380+0.005650 
## [1151]   train-rmse:1.989501+0.006228    test-rmse:3.056679+0.005847 
## [1201]   train-rmse:1.962299+0.006104    test-rmse:3.054455+0.005260 
## [1251]   train-rmse:1.936194+0.007773    test-rmse:3.052410+0.005896 
## [1301]   train-rmse:1.908768+0.007372    test-rmse:3.051539+0.006504 
## [1351]   train-rmse:1.883683+0.007715    test-rmse:3.051109+0.006276 
## [1401]   train-rmse:1.859494+0.006603    test-rmse:3.050993+0.007563 
## [1451]   train-rmse:1.835040+0.004953    test-rmse:3.050164+0.007402 
## [1501]   train-rmse:1.811114+0.003270    test-rmse:3.049802+0.007833 
## [1551]   train-rmse:1.789402+0.002534    test-rmse:3.049649+0.008335 
## [1601]   train-rmse:1.767315+0.001836    test-rmse:3.049585+0.008558 
## [1651]   train-rmse:1.746325+0.001873    test-rmse:3.049909+0.009003 
## [1701]   train-rmse:1.725838+0.002535    test-rmse:3.050378+0.009136 
## [1751]   train-rmse:1.704581+0.003683    test-rmse:3.050737+0.008637 
## [1801]   train-rmse:1.685065+0.004633    test-rmse:3.050562+0.008606 
## [1851]   train-rmse:1.664703+0.004838    test-rmse:3.051120+0.008569 
## [1901]   train-rmse:1.645825+0.004266    test-rmse:3.051313+0.008286 
## [1951]   train-rmse:1.626805+0.004438    test-rmse:3.052526+0.008683 
## [2000]   train-rmse:1.608829+0.005378    test-rmse:3.053323+0.008541

The evolution of RMSE can then be plotted:

xgbInit_cv$evaluation_log %>%
    select(iter, train=train_rmse_mean, test=test_rmse_mean) %>%
    pivot_longer(-iter, names_to="type", values_to="rmse") %>%
    filter(iter > 10) %>%
    ggplot(aes(x=iter, y=rmse, color=type, group=type)) + 
    geom_line() + 
    scale_color_discrete("") +
    labs(x="# Iterations", 
         y="RMSE of Temperature (F)", 
         title="XGB Model RMSE when trained using Cross Validation (5-fold)", 
         subtitle="plot excludes first 10 iterations"
         )

There is a significant divergence between train RMSE and test RMSE, with test RMSE mostly optimized after 500-1000 iterations, while train RMSE continues to improve almost linearly even after 1000 iterations. Unlike random forests, which control variance through bootstrap resampling and random variable subsetting at each split, the XGB model continues to fit residuals in the same dataset even after it has learned every pattern that generalizes to unseen data. Proper use of CV and early stopping will be important.
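As a sketch of how early stopping could be wired in (reusing the xgbTrain_big4 matrix and xgbTarget_big4 vector created above; the value of 50 for early_stopping_rounds is illustrative), xgboost::xgb.cv can halt training once the test metric stops improving:

# Sketch: same training data as above, stopping if test RMSE fails to improve for 50 rounds
xgbEarlyStop_cv <- xgboost::xgb.cv(data=xgbTrain_big4, 
                                   label=xgbTarget_big4, 
                                   nrounds=2000, 
                                   nfold=5,
                                   early_stopping_rounds=50,
                                   print_every_n=50, 
                                   objective="reg:squarederror"
                                   )

# The iteration with the best mean test RMSE is stored on the returned object
xgbEarlyStop_cv$best_iteration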

Notably, XGB achieves roughly 3.0 degrees RMSE on the test data, while the random forest achieved roughly 3.5 degrees. In part, this is driven by XGB's very fast optimization, which allows many more iterations than a random forest can run in the same time.

Potential next steps are to incorporate xgb.cv as an option in the training function, and to explore hyperparameters such as eta (learning rate) and maximum depth, and their impacts on the test set RMSE.

Function xgbRunModel has been updated to accept either xgboost::xgboost or xgboost::xgb.cv. This allows for running several versions of the hyperparameters through the CV process. For example:

# Create hyperparameter space
hypGrid <- expand.grid(eta=c(0.1, 0.3, 0.6), 
                       max_depth=c(3, 6, 10), 
                       nrounds=500, 
                       nfold=5
                       )

# Create containers for results
xgbSmall <- vector("list", nrow(hypGrid))

# Run xgb.cv once for each combination of hyper-parameters
for (rowNum in 1:nrow(hypGrid)) {

    # Extract the relevant hyperparameter data row
    params <- hypGrid[rowNum, ] %>% unlist()
    
    # Run the function for 5-fold CV for a few values of eta and max_depth
    xgbTemp <- xgbRunModel(filter(metarData, !is.na(TempF)), 
                           depVar="TempF", 
                           predVars=baseXGBPreds, 
                           otherVars=c("source", "dtime"), 
                           critFilter=list(locNamefct=multiYearLocales),
                           seed=2008041345,
                           nrounds=params["nrounds"],
                           eta=params["eta"], 
                           max_depth=params["max_depth"],
                           print_every_n=50, 
                           funcRun=xgboost::xgb.cv, 
                           nfold=params["nfold"]
                           )
    
    xgbSmall[[rowNum]] <- list(params=params, results=xgbTemp)
    
}
## [1]  train-rmse:60.291858+0.029840   test-rmse:60.291836+0.122552 
## [51] train-rmse:6.952588+0.022743    test-rmse:6.967671+0.028935 
## [101]    train-rmse:5.911296+0.030230    test-rmse:5.934121+0.031484 
## [151]    train-rmse:5.440318+0.028012    test-rmse:5.468477+0.030610 
## [201]    train-rmse:5.143110+0.026557    test-rmse:5.177588+0.031563 
## [251]    train-rmse:4.938329+0.027705    test-rmse:4.977193+0.034568 
## [301]    train-rmse:4.777454+0.021411    test-rmse:4.820399+0.026625 
## [351]    train-rmse:4.632995+0.018280    test-rmse:4.677907+0.027320 
## [401]    train-rmse:4.504584+0.022960    test-rmse:4.552695+0.030454 
## [451]    train-rmse:4.400923+0.020546    test-rmse:4.452110+0.027658 
## [500]    train-rmse:4.318087+0.017160    test-rmse:4.371721+0.020521 
## [1]  train-rmse:47.414653+0.025543   test-rmse:47.416267+0.110105 
## [51] train-rmse:5.505623+0.039939    test-rmse:5.542213+0.055560 
## [101]    train-rmse:4.837401+0.018193    test-rmse:4.887497+0.028972 
## [151]    train-rmse:4.441178+0.027482    test-rmse:4.499732+0.035334 
## [201]    train-rmse:4.214036+0.027466    test-rmse:4.279759+0.029091 
## [251]    train-rmse:4.061037+0.018717    test-rmse:4.133555+0.019708 
## [301]    train-rmse:3.951799+0.025049    test-rmse:4.028980+0.023754 
## [351]    train-rmse:3.865718+0.023845    test-rmse:3.949121+0.023304 
## [401]    train-rmse:3.794199+0.018816    test-rmse:3.882646+0.017903 
## [451]    train-rmse:3.730279+0.018420    test-rmse:3.823405+0.016648 
## [500]    train-rmse:3.681347+0.016766    test-rmse:3.779265+0.015579 
## [1]  train-rmse:28.598478+0.022489   test-rmse:28.602791+0.088590 
## [51] train-rmse:5.107773+0.055545    test-rmse:5.154449+0.062315 
## [101]    train-rmse:4.415871+0.025289    test-rmse:4.478973+0.021201 
## [151]    train-rmse:4.132172+0.022054    test-rmse:4.211309+0.030999 
## [201]    train-rmse:3.941383+0.022689    test-rmse:4.033375+0.025258 
## [251]    train-rmse:3.801489+0.009797    test-rmse:3.904041+0.018229 
## [301]    train-rmse:3.690903+0.008401    test-rmse:3.806468+0.015299 
## [351]    train-rmse:3.600958+0.010827    test-rmse:3.725559+0.017532 
## [401]    train-rmse:3.528479+0.008935    test-rmse:3.662803+0.017003 
## [451]    train-rmse:3.467037+0.011508    test-rmse:3.609574+0.018976 
## [500]    train-rmse:3.411157+0.012271    test-rmse:3.560714+0.014885 
## [1]  train-rmse:60.209776+0.029499   test-rmse:60.210296+0.117814 
## [51] train-rmse:5.102061+0.012033    test-rmse:5.171496+0.017799 
## [101]    train-rmse:4.363518+0.019358    test-rmse:4.479946+0.015077 
## [151]    train-rmse:4.008150+0.011056    test-rmse:4.166304+0.019110 
## [201]    train-rmse:3.788664+0.005680    test-rmse:3.981695+0.017881 
## [251]    train-rmse:3.622718+0.006958    test-rmse:3.847228+0.020058 
## [301]    train-rmse:3.491297+0.008235    test-rmse:3.742838+0.018430 
## [351]    train-rmse:3.382102+0.008616    test-rmse:3.656639+0.015591 
## [401]    train-rmse:3.288474+0.008511    test-rmse:3.584193+0.017333 
## [451]    train-rmse:3.210271+0.009739    test-rmse:3.525187+0.016061 
## [500]    train-rmse:3.141030+0.008145    test-rmse:3.474401+0.017219 
## [1]  train-rmse:47.134219+0.024267   test-rmse:47.136035+0.096315 
## [51] train-rmse:4.045615+0.025654    test-rmse:4.213756+0.033934 
## [101]    train-rmse:3.535829+0.009401    test-rmse:3.796318+0.014716 
## [151]    train-rmse:3.259137+0.008667    test-rmse:3.586416+0.015300 
## [201]    train-rmse:3.068447+0.007301    test-rmse:3.453034+0.014511 
## [251]    train-rmse:2.920823+0.009054    test-rmse:3.359908+0.012705 
## [301]    train-rmse:2.804430+0.009715    test-rmse:3.295715+0.010669 
## [351]    train-rmse:2.708823+0.006058    test-rmse:3.248712+0.010730 
## [401]    train-rmse:2.624162+0.008032    test-rmse:3.212151+0.012557 
## [451]    train-rmse:2.551436+0.007545    test-rmse:3.184392+0.013666 
## [500]    train-rmse:2.488435+0.005479    test-rmse:3.160857+0.013264 
## [1]  train-rmse:27.817778+0.017617   test-rmse:27.823969+0.065229 
## [51] train-rmse:3.703173+0.029105    test-rmse:3.974862+0.032755 
## [101]    train-rmse:3.196706+0.018067    test-rmse:3.613539+0.022093 
## [151]    train-rmse:2.931391+0.021877    test-rmse:3.464887+0.027318 
## [201]    train-rmse:2.751677+0.016366    test-rmse:3.390184+0.023319 
## [251]    train-rmse:2.616835+0.013893    test-rmse:3.351378+0.021288 
## [301]    train-rmse:2.501442+0.009328    test-rmse:3.325782+0.019926 
## [351]    train-rmse:2.405215+0.005827    test-rmse:3.308294+0.019058 
## [401]    train-rmse:2.324063+0.006549    test-rmse:3.296866+0.018918 
## [451]    train-rmse:2.248729+0.008088    test-rmse:3.289588+0.016957 
## [500]    train-rmse:2.177265+0.006178    test-rmse:3.285350+0.016165 
## [1]  train-rmse:60.177299+0.028919   test-rmse:60.177698+0.114226 
## [51] train-rmse:3.881933+0.016210    test-rmse:4.260229+0.017132 
## [101]    train-rmse:3.137727+0.015027    test-rmse:3.773720+0.026478 
## [151]    train-rmse:2.710896+0.008586    test-rmse:3.534585+0.022632 
## [201]    train-rmse:2.448061+0.002653    test-rmse:3.407226+0.018462 
## [251]    train-rmse:2.251032+0.006251    test-rmse:3.322486+0.015428 
## [301]    train-rmse:2.097336+0.008329    test-rmse:3.267107+0.015011 
## [351]    train-rmse:1.969215+0.011110    test-rmse:3.228843+0.014409 
## [401]    train-rmse:1.862160+0.016369    test-rmse:3.200573+0.013999 
## [451]    train-rmse:1.764049+0.012783    test-rmse:3.179215+0.012618 
## [500]    train-rmse:1.677405+0.012126    test-rmse:3.163679+0.010797 
## [1]  train-rmse:47.019374+0.022382   test-rmse:47.021793+0.085563 
## [51] train-rmse:2.801653+0.022126    test-rmse:3.675591+0.026313 
## [101]    train-rmse:2.159013+0.010218    test-rmse:3.419520+0.012152 
## [151]    train-rmse:1.800193+0.012705    test-rmse:3.341851+0.011634 
## [201]    train-rmse:1.550361+0.009200    test-rmse:3.313317+0.010477 
## [251]    train-rmse:1.359007+0.013033    test-rmse:3.300204+0.010413 
## [301]    train-rmse:1.199933+0.009052    test-rmse:3.295912+0.009275 
## [351]    train-rmse:1.071990+0.011760    test-rmse:3.296180+0.007863 
## [401]    train-rmse:0.963536+0.007773    test-rmse:3.296334+0.007524 
## [451]    train-rmse:0.865972+0.006748    test-rmse:3.297515+0.008136 
## [500]    train-rmse:0.780306+0.006435    test-rmse:3.299618+0.007300 
## [1]  train-rmse:27.480704+0.012962   test-rmse:27.490717+0.044583 
## [51] train-rmse:2.346123+0.023517    test-rmse:3.765403+0.026814 
## [101]    train-rmse:1.710224+0.021395    test-rmse:3.707509+0.029322 
## [151]    train-rmse:1.339575+0.008671    test-rmse:3.711507+0.028175 
## [201]    train-rmse:1.058404+0.007993    test-rmse:3.724789+0.028353 
## [251]    train-rmse:0.846464+0.012995    test-rmse:3.737257+0.026698 
## [301]    train-rmse:0.675616+0.010724    test-rmse:3.745457+0.024722 
## [351]    train-rmse:0.552008+0.013427    test-rmse:3.751203+0.023449 
## [401]    train-rmse:0.446647+0.009868    test-rmse:3.757080+0.022794 
## [451]    train-rmse:0.364546+0.009564    test-rmse:3.761381+0.022925 
## [500]    train-rmse:0.298650+0.006166    test-rmse:3.764331+0.022158

The test and train RMSE by iteration can be extracted and plotted by combination of parameters:

# Extract the key parameters as a character vector
allParams <- sapply(xgbSmall, FUN=function(x) x[["params"]])
keyParamVec <- allParams %>% 
    apply(2, FUN=function(x) paste0(names(x)[1], ": ", x[1], ", ", names(x)[2], ": ", x[2]))

# Extract RMSE and attach character vector
allRMSE <- map_dfr(xgbSmall, 
                   .f=function(x) x[["results"]][["xgbModel"]]$evaluation_log, 
                   .id="source"
                   ) %>%
    mutate(desc=factor(keyParamVec[as.numeric(source)], levels=keyParamVec)) %>%
    tibble::as_tibble()

# Get the minimum test RMSE
minTestRMSE <- allRMSE %>% 
    select(test_rmse_mean) %>% 
    min()

# Create plot for final test RMSE
allRMSE %>%
    select(desc, iter, train=train_rmse_mean, test=test_rmse_mean) %>%
    filter(iter==max(iter)) %>%
    ggplot(aes(x=desc, y=test)) + 
    geom_col(fill="lightblue") + 
    geom_text(aes(y=test/2, label=round(test, 1))) + 
    labs(x="", y="Test RMSE after 500 iterations", title="Test RMSE at 500 iterations by hyperparameters") +
    coord_flip()

# Create plot for RMSE evolution
allRMSE %>%
    select(desc, iter, train=train_rmse_mean, test=test_rmse_mean) %>%
    pivot_longer(-c(desc, iter), names_to="type", values_to="RMSE") %>%
    filter(RMSE <= 8) %>%
    ggplot(aes(x=iter, y=RMSE, group=type, color=type)) + 
    geom_line() + 
    labs(x="# Iterations", title="RMSE Evolution by Hyperparameters", subtitle="Only RMSE <= 8 plotted") +
    geom_hline(aes(yintercept=minTestRMSE), color="red", lty=2) +
    geom_vline(aes(xintercept=10), lty=2) +
    facet_wrap(~desc)

As expected, increasing eta and max_depth tends to induce over-fitting while also reaching the optimal RMSE more quickly: a trade-off. The defaults eta=0.3 and max_depth=6 appear to be close to the minimum RMSE at 500 iterations, with a rather modest gap between test and train RMSE. The most over-fit model (eta=0.6, max_depth=10) appears to have converged with a high test RMSE. The slowest model (eta=0.1, max_depth=3) appears to still have significant room to learn even after 500 iterations.

The top performing models at 500 iterations appear to be blends of parameters that average out to “moderate-high” learning ability (low eta=0.1 with high max_depth=10, medium eta=0.3 with medium max_depth=6, medium eta=0.3 with high max_depth=10, high eta=0.6 with medium max_depth=6).
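A complementary view (a sketch reusing the allRMSE tibble built above, and assuming dplyr >= 1.0 for slice_min) is to find, for each hyperparameter combination, the iteration at which mean test RMSE bottoms out, rather than comparing only the final iteration:

# For each hyperparameter combination, locate the iteration with the minimum test RMSE
allRMSE %>%
    group_by(desc) %>%
    slice_min(test_rmse_mean, n=1, with_ties=FALSE) %>%
    ungroup() %>%
    select(desc, best_iter=iter, best_test_rmse=test_rmse_mean) %>%
    arrange(best_test_rmse)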

Suppose the slowest model is run for 2500 iterations to check on convergence:

# Define key predictor variables for base XGB runs
baseXGBPreds <- c("locNamefct", "month", "hrfct", 
                  "DewF", "modSLP", "Altimeter", "WindSpeed", 
                  "predomDir", "minHeight", "ceilingHeight"
                  )

# Core multi-year cities
multiYearLocales <- c("Las Vegas, NV", "New Orleans, LA", "Chicago, IL", "San Diego, CA")

# Run the function shell
xgbSlow <- xgbRunModel(filter(metarData, !is.na(TempF)), 
                       depVar="TempF", 
                       predVars=baseXGBPreds, 
                       otherVars=c("source", "dtime"), 
                       critFilter=list(locNamefct=multiYearLocales),
                       seed=2008041432,
                       nrounds=2500,
                       print_every_n=50, 
                       eta=0.1, 
                       max_depth=3
                       )
## [1]  train-rmse:60.254131 
## [51] train-rmse:6.966332 
## [101]    train-rmse:5.919064 
## [151]    train-rmse:5.452974 
## [201]    train-rmse:5.163146 
## [251]    train-rmse:4.979972 
## [301]    train-rmse:4.815380 
## [351]    train-rmse:4.661036 
## [401]    train-rmse:4.540689 
## [451]    train-rmse:4.424547 
## [501]    train-rmse:4.343824 
## [551]    train-rmse:4.269697 
## [601]    train-rmse:4.204936 
## [651]    train-rmse:4.158858 
## [701]    train-rmse:4.111222 
## [751]    train-rmse:4.061322 
## [801]    train-rmse:4.019222 
## [851]    train-rmse:3.981936 
## [901]    train-rmse:3.936117 
## [951]    train-rmse:3.910823 
## [1001]   train-rmse:3.884617 
## [1051]   train-rmse:3.862150 
## [1101]   train-rmse:3.838576 
## [1151]   train-rmse:3.815784 
## [1201]   train-rmse:3.794541 
## [1251]   train-rmse:3.777628 
## [1301]   train-rmse:3.758889 
## [1351]   train-rmse:3.743135 
## [1401]   train-rmse:3.727739 
## [1451]   train-rmse:3.709213 
## [1501]   train-rmse:3.688144 
## [1551]   train-rmse:3.676046 
## [1601]   train-rmse:3.661273 
## [1651]   train-rmse:3.644699 
## [1701]   train-rmse:3.625214 
## [1751]   train-rmse:3.612606 
## [1801]   train-rmse:3.600801 
## [1851]   train-rmse:3.581740 
## [1901]   train-rmse:3.570685 
## [1951]   train-rmse:3.559159 
## [2001]   train-rmse:3.544988 
## [2051]   train-rmse:3.535298 
## [2101]   train-rmse:3.526862 
## [2151]   train-rmse:3.515605 
## [2201]   train-rmse:3.503624 
## [2251]   train-rmse:3.493750 
## [2301]   train-rmse:3.483091 
## [2351]   train-rmse:3.475146 
## [2401]   train-rmse:3.467263 
## [2451]   train-rmse:3.458688 
## [2500]   train-rmse:3.450731

The evaluation functions can then be run:

# Plot the evolution
xgbSlowEvolution <- plotXGBTrainEvolution(xgbSlow, fullSD="TempF", plot_every=100)

# Assess performance on test data
plotXGBTestData(xgbSlow, depVar="TempF", reportBy=c("locNamefct", "month"))
## 
## OVERALL PREDICTIVE PERFORMANCE:
## 
## # A tibble: 1 x 3
##   rmse_orig rmse_xgb   rsq
##       <dbl>    <dbl> <dbl>
## 1      18.0     3.54 0.961
## 
## 
## 
## PREDICTIVE PERFORMANCE BY GROUP(S):
## 
## # A tibble: 4 x 4
##   locNamefct      rmse_orig rmse_xgb   rsq
##   <fct>               <dbl>    <dbl> <dbl>
## 1 Chicago, IL         21.1      4.01 0.964
## 2 Las Vegas, NV       18.2      3.29 0.967
## 3 New Orleans, LA     13.4      3.47 0.933
## 4 San Diego, CA        7.35     3.32 0.796
## 
## # A tibble: 12 x 4
##    month rmse_orig rmse_xgb   rsq
##    <fct>     <dbl>    <dbl> <dbl>
##  1 Jan        17.2     3.42 0.960
##  2 Feb        17.7     3.73 0.956
##  3 Mar        14.8     3.70 0.937
##  4 Apr        12.5     4.00 0.898
##  5 May        11.4     3.78 0.890
##  6 Jun        12.0     3.33 0.923
##  7 Jul        11.1     3.06 0.924
##  8 Aug        10.3     2.88 0.922
##  9 Sep        10.2     3.38 0.890
## 10 Oct        11.6     3.72 0.896
## 11 Nov        14.0     3.86 0.924
## 12 Dec        14.2     3.42 0.942
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'

Even at 2500 iterations, the model has almost no over-fitting (train RMSE 3.5, test RMSE 3.5). However, the model appears to be converging at several tenths of a degree higher test RMSE than some of the other models achieve in as little as 500 iterations.

XGB classification of locales for 2016

The XGB approach can also be followed for classifications, by passing a factor variable for y and updating the objective function and metric to be appropriate for classification. The general recipe is similar to what is followed for regression.
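
Concretely, the swap is small. As a sketch (using standard xgboost parameter names; everything else in the recipe, including data preparation and cross-validation, is unchanged):

```r
# Sketch: the only changes from the regression recipe are a 0/1 label
# and the objective/metric pair (standard xgboost parameter names).
regArgs <- list(objective = "reg:squarederror", eval_metric = "rmse")
clsArgs <- list(objective = "binary:logistic",  eval_metric = "error")

# Same knobs, different values
identical(names(regArgs), names(clsArgs))
```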

To begin, a simple classification will be run using 2016 data to assess “are these data from Las Vegas?”. Las Vegas is selected because its climate is largely distinct from the other locales. To further simplify this first pass, Phoenix data will be excluded, and the “all other” cities will be under-sampled to the same data volume as Las Vegas:

set.seed(2008051312)

# Extract a balanced sample of Las Vegas 2016 data and all-other 2016 data (excluding Phoenix)
las2016Data <- metarData %>%
    filter(year==2016, !is.na(TempF), !(locNamefct %in% c("Phoenix, AZ"))) %>%
    mutate(isLAS=factor(ifelse(locNamefct=="Las Vegas, NV", "Las Vegas", "All Other"), 
                        levels=c("Las Vegas", "All Other")
                        ), 
           nLAS=sum(isLAS=="Las Vegas")
           ) %>%
    group_by(isLAS) %>%
    sample_n(min(nLAS)) %>% # not ideal coding, but it works
    ungroup()

First, the random forest approach is run on the Las Vegas 2016 data:

# Define key predictor variables for base XGB runs
locXGBPreds <- c("month", "hrfct", 
                 "TempF", "DewF", "modSLP", "Altimeter", "WindSpeed", 
                 "predomDir", "minHeight", "ceilingHeight"
                 )

# Run random forest for Las Vegas 2016 classifications
rf_las2016 <- rfMultiLocale(las2016Data, 
                            vrbls=locXGBPreds,
                            otherVar=keepVarFull,
                            locs=NULL, 
                            locVar="isLAS",
                            pred="isLAS",
                            ntree=100, 
                            seed=2008051316, 
                            mtry=4
                            )
## 
## Running for locations:
## [1] "Las Vegas" "All Other"

The variable importances and quality of the predictions can then be assessed:

# Plot variable importances
helperPlotVarImp(rf_las2016$rfModel)

# Evaluate prediction accuracy
evalPredictions(rf_las2016, 
                plotCaption = "Temp, Dew Point, Month, Hour of Day, Cloud Height, Wind, SLP, Altimeter", 
                keyVar="isLAS"
                )

## # A tibble: 4 x 5
##   locale    predicted correct     n    pct
##   <fct>     <fct>     <lgl>   <int>  <dbl>
## 1 Las Vegas Las Vegas TRUE     2496 0.952 
## 2 Las Vegas All Other FALSE     125 0.0477
## 3 All Other Las Vegas FALSE     108 0.0412
## 4 All Other All Other TRUE     2512 0.959

Using majority-rules classification, the model achieves a balanced ~96% accuracy in classifying locales as Las Vegas or All Other. The variables that contribute most to the classification are dew point, minimum cloud height, and temperature. This seems plausible: Las Vegas is rarely cloudy (and almost never at low levels) and shows a markedly different density on a dew point/temperature plot than most other locales:

plotOrder <- rev(levels(rf_las2016$testData$minHeight))

p1_las2016 <- rf_las2016$testData %>%
    ggplot(aes(x=isLAS, fill=factor(minHeight, levels=plotOrder))) + 
    geom_bar(position="fill") + 
    coord_flip() + 
    labs(x="", y="Percentage of observations", title="Minimum cloud heights") + 
    scale_fill_discrete("Min Cloud Height", guide=guide_legend(reverse=TRUE, nrow=2, byrow=TRUE), 
                        labels=plotOrder, breaks=plotOrder
                        ) + 
    theme(legend.position="bottom")

p2_las2016 <- rf_las2016$testData %>%
    ggplot(aes(x=TempF, y=DewF)) + 
    geom_point(alpha=0.1) + 
    labs(x="Temperature (F)", y="Dew Point (F)", title="Temp/Dew Point Distribution") + 
    facet_wrap(~isLAS)

gridExtra::grid.arrange(p1_las2016, p2_las2016, nrow=1)

The model has 80%+ votes in one direction or the other about 80% of the time. Those classifications are ~99% accurate. When the model is less confident, prediction quality declines to the 70%-80% range:

# Evaluate prediction certainty
probs_las2016 <- 
    assessPredictionCertainty(rf_las2016, 
                              keyVar="isLAS", 
                              plotCaption="Temp, Dew Point, Month/Hour, Clouds, Wind, SLP, Altimeter", 
                              showAcc=TRUE
                              )
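
The bucketing behind this assessment can be illustrated with a base-R toy (hypothetical values; the real function presumably works from the winning-class vote shares produced by the random forest):

```r
# Toy illustration: group test predictions by the winning vote share,
# then compute accuracy within high- vs. low-confidence buckets.
votes   <- c(0.95, 0.62, 0.88, 0.55, 0.99, 0.71)    # winning-class vote share
correct <- c(TRUE, FALSE, TRUE, TRUE, TRUE, FALSE)  # toy outcomes
bucket  <- ifelse(votes >= 0.8, "high", "low")
tapply(correct, bucket, mean)
```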

The process is then run using XGB, with the following modifications:

  1. Convert the isLAS variable to 0/1
  2. Update the objective function to be logistic

First, the data are reshaped and the CV process is run to see where the test error stabilizes:

las2016Data <- las2016Data %>%
    mutate(binLAS=ifelse(isLAS=="Las Vegas", 1, 0))
# Run the function shell
xgb_las2016_cv <- xgbRunModel(las2016Data, 
                              depVar="binLAS", 
                              predVars=locXGBPreds, 
                              otherVars=keepVarFull, 
                              seed=2008051405,
                              nrounds=1000,
                              print_every_n=50, 
                              xgbObjective="binary:logistic", 
                              funcRun=xgboost::xgb.cv, 
                              nfold=5
                              )
## [1]  train-error:0.089480+0.002471   test-error:0.100826+0.005528 
## [51] train-error:0.011428+0.001170   test-error:0.038433+0.002412 
## [101]    train-error:0.002208+0.000704   test-error:0.031809+0.002303 
## [151]    train-error:0.000470+0.000153   test-error:0.030664+0.002807 
## [201]    train-error:0.000020+0.000041   test-error:0.029602+0.001874 
## [251]    train-error:0.000000+0.000000   test-error:0.028784+0.002404 
## [301]    train-error:0.000000+0.000000   test-error:0.028866+0.002539 
## [351]    train-error:0.000000+0.000000   test-error:0.028293+0.001870 
## [401]    train-error:0.000000+0.000000   test-error:0.027721+0.002242 
## [451]    train-error:0.000000+0.000000   test-error:0.027721+0.002799 
## [501]    train-error:0.000000+0.000000   test-error:0.027721+0.002301 
## [551]    train-error:0.000000+0.000000   test-error:0.027476+0.002811 
## [601]    train-error:0.000000+0.000000   test-error:0.027312+0.002950 
## [651]    train-error:0.000000+0.000000   test-error:0.027230+0.003128 
## [701]    train-error:0.000000+0.000000   test-error:0.027557+0.002997 
## [751]    train-error:0.000000+0.000000   test-error:0.027148+0.003020 
## [801]    train-error:0.000000+0.000000   test-error:0.027721+0.002627 
## [851]    train-error:0.000000+0.000000   test-error:0.027557+0.002513 
## [901]    train-error:0.000000+0.000000   test-error:0.027230+0.002766 
## [951]    train-error:0.000000+0.000000   test-error:0.027230+0.002604 
## [1000]   train-error:0.000000+0.000000   test-error:0.027148+0.002500
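
One way to turn a log like this into a concrete choice of nrounds is a one-standard-error rule; a base-R sketch on toy values read off the log above (column names follow xgb.cv's evaluation_log):

```r
# Smallest iteration whose CV test error is within one standard error
# of the minimum (a common early-stopping heuristic).
cvLog <- data.frame(iter            = c(101, 251, 501, 1000),
                    test_error_mean = c(0.0318, 0.0288, 0.0277, 0.0271),
                    test_error_std  = c(0.0023, 0.0024, 0.0023, 0.0025))
threshold <- with(cvLog, min(test_error_mean) + test_error_std[which.min(test_error_mean)])
min(cvLog$iter[cvLog$test_error_mean <= threshold])
```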

Error rate evolution can be plotted:

# Calculate minimum test error
minTestError <- min(xgb_las2016_cv$xgbModel$evaluation_log$test_error_mean)

# Create plot for error-rate evolution
xgb_las2016_cv$xgbModel$evaluation_log %>%
    select(-contains("std")) %>%
    select(iter, train=train_error_mean, test=test_error_mean) %>%
    pivot_longer(-c(iter), names_to="type", values_to="error") %>%
    ggplot(aes(x=iter, y=error, group=type, color=type)) + 
    geom_line() + 
    geom_hline(aes(yintercept=minTestError), color="red", lty=2) +
    labs(x="# Iterations", y="Classification Error", title="Error Rate Evolution")

Test error flattens out near 2.7% beyond roughly 250-500 iterations, while train error is driven to zero by about iteration 250. A full XGB process is run using 500 iterations:

# Run the function shell
xgb_las2016 <- xgbRunModel(las2016Data, 
                           depVar="binLAS", 
                           predVars=locXGBPreds, 
                           otherVars=keepVarFull, 
                           seed=2008051405,
                           nrounds=500,
                           print_every_n=25, 
                           xgbObjective="binary:logistic"
                           )
## [1]  train-error:0.096001 
## [26] train-error:0.025513 
## [51] train-error:0.012429 
## [76] train-error:0.006215 
## [101]    train-error:0.002535 
## [126]    train-error:0.001554 
## [151]    train-error:0.000409 
## [176]    train-error:0.000082 
## [201]    train-error:0.000000 
## [226]    train-error:0.000000 
## [251]    train-error:0.000000 
## [276]    train-error:0.000000 
## [301]    train-error:0.000000 
## [326]    train-error:0.000000 
## [351]    train-error:0.000000 
## [376]    train-error:0.000000 
## [401]    train-error:0.000000 
## [426]    train-error:0.000000 
## [451]    train-error:0.000000 
## [476]    train-error:0.000000 
## [500]    train-error:0.000000

Accuracy of the classifications can then be assessed by investigating the “predicted” column of the testData frame:

# Overall classification probability histogram
xgb_las2016$testData %>%
    ggplot(aes(x=predicted)) + 
    geom_histogram(fill="lightblue") + 
    labs(x="Predicted Probability of Las Vegas", y="", 
         title="Prediction Probabilities - Las Vegas vs. Other"
         )
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

# Overall accuracy by bin
xgb_las2016$testData %>%
    mutate(binPred=ifelse(predicted>=0.5, 1, 0)) %>%
    group_by(binLAS) %>%
    summarize(accuracy=mean(binLAS==binPred))
## # A tibble: 2 x 2
##   binLAS accuracy
##    <dbl>    <dbl>
## 1      0    0.964
## 2      1    0.986
# Accuracy by predicted probability (whether certainty is at least 80%)
xgb_las2016$testData %>%
    mutate(binPred=ifelse(predicted>=0.5, 1, 0), 
           confPred=ifelse(abs(predicted-0.5)>=0.3, "High", "Low")
           ) %>%
    group_by(confPred, binLAS) %>%
    summarize(accuracy=mean(binLAS==binPred), n=n())
## # A tibble: 4 x 4
## # Groups:   confPred [2]
##   confPred binLAS accuracy     n
##   <chr>     <dbl>    <dbl> <int>
## 1 High          0    0.975  2608
## 2 High          1    0.994  2485
## 3 Low           0    0.558    77
## 4 Low           1    0.690    71

Overall accuracy is in the 97% range, roughly 1% higher than is achieved using the random forest algorithm. The XGB model is slightly more likely to make a high-confidence error in classifying a non-Las Vegas locale as Las Vegas, and much less likely to have a low confidence in its predictions.

Variable importance can also be assessed:

# Find and plot importances
xgb_las2016_importance <- plotXGBImportance(xgb_las2016, 
                                            featureStems=locXGBPreds, 
                                            stemMapper = varMapper, 
                                            plotTitle="Gain by variable in xgboost", 
                                            plotSubtitle="Modeling 2016 Locale (Las Vegas vs. Other)"
                                            )

At a glance, the XGB algorithm makes high use of dew point and temperature, consistent with the random forest algorithm. In contrast, the XGB algorithm prefers to use sea-level pressure next while the random forest model prefers to use minimum cloud height next.
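
The featureStems argument presumably exists because xgboost sees one-hot encoded columns rather than the original factors; a base-R sketch of collapsing per-column gain back to per-variable gain (toy gains, hypothetical column names):

```r
# Sum gain over every one-hot column whose name starts with a known
# variable stem (longest matching stem wins).
gains <- c(monthJan = 0.05, monthFeb = 0.03, TempF = 0.40, DewF = 0.45,
           predomDirNE = 0.04, predomDirSW = 0.03)
stems <- c("month", "TempF", "DewF", "predomDir")
stemOf <- vapply(names(gains), function(nm) {
    hits <- stems[startsWith(nm, stems)]
    hits[which.max(nchar(hits))]
}, character(1))
sort(tapply(gains, stemOf, sum), decreasing=TRUE)
```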

Lastly, predictions are plotted against actual values:

xgb_las2016$testData %>%
    mutate(rndPred=round(2*predicted, 1)/2) %>%
    group_by(rndPred) %>%
    summarize(n=n(), meanPred=mean(binLAS)) %>%
    ggplot(aes(x=rndPred, y=meanPred)) + 
    geom_point(aes(size=n)) + 
    geom_abline(lty=2, color="red") + 
    geom_text(aes(y=meanPred+0.1, label=paste0((round(meanPred, 2)), "\n(n=", n, ")")), size=3) + 
    labs(x="Predicted Probability Las Vegas", y="Actual Proportion Las Vegas")

Broadly, there is a strong association between the predicted probabilities and the actual proportions. As noted previously, almost all of the predictions are correctly made with high confidence.
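
Calibration can also be summarized in a single number with the Brier score, the mean squared difference between predicted probability and outcome (lower is better; always guessing 0.5 on balanced data scores 0.25). A minimal sketch on toy values:

```r
# Brier score: mean squared error of probabilistic predictions
brier <- function(pred, actual) mean((pred - actual)^2)
brier(pred   = c(0.95, 0.10, 0.80, 0.40, 0.99),
      actual = c(1, 0, 1, 1, 1))
```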

Next steps are to explore some trickier single-locale classifications and then to explore multi-locale classifications.

Suppose that every locale is run through the process of self vs. other, with other under-sampled to be the same size as self, using 500 rounds:

# Function to run one vs all using XGB
helperXGBOnevAll <- function(df, 
                             keyLoc, 
                             critFilter=vector("list", 0),
                             underSample=TRUE,
                             predVars=locXGBPreds, 
                             otherVars=keepVarFull, 
                             seed=NULL, 
                             nrounds=500, 
                             print_every_n=100, 
                             xgbObjective="binary:logistic", 
                             ...
                             ) {
    
    # FUNCTION ARGUMENTS:
    # df: the data frame or tibble
    # keyLoc: the value of locNamefct to use as the 'one' for 'one' vs. all-other
    # critFilter: named list, of format name=values, where filtering will be (get(name) %in% values)
    # underSample: boolean, if TRUE take 'all other' and randomly under-sample to be the same size as keyLoc
    # predVars: explanatory variables for modeling
    # otherVars: other variables to be kept, but not used in modeling
    # seed: the random seed (NULL means no seed)
    # nrounds: the number of XGB training rounds
    # print_every_n: the frequency of printing the XGB training error
    # xgbObjective: the objective function to be used in XGB modeling
    # ...: any other arguments to be passed (eventually) to xgboost::xgboost
    
    # Set the seed if it has been passed (drives consistency of under-sampling)
    if (!is.null(seed)) set.seed(seed)
    
    # Generate the descriptive name for keyLoc
    descName <- str_replace(keyLoc, pattern=", \\w{2}$", replacement="")
    
    # Announce the locale being run
    cat("\n\n **************")
    cat("\nRunning for", keyLoc, "with decription", descName, "\n")
    
    # Filter for non-NA data across all of 'locNamefct', predVars, otherVars, names(critFilter) are included
    subDF <- df %>%
        filter_at(vars(all_of(c("locNamefct", predVars, otherVars, names(critFilter)))), 
                  all_vars(!is.na(.))
                  )
    
    # Filter such that only matches to critFilter are included
    for (xNum in seq_along(critFilter)) {
        subDF <- subDF %>%
            filter_at(vars(all_of(names(critFilter)[xNum])), ~. %in% critFilter[[xNum]])
    }

    # Create the variables needed for modeling
    subDF <- subDF %>%
        mutate(isKEY=factor(ifelse(locNamefct==keyLoc, descName, "All Other"), 
                            levels=c(descName, "All Other")
                            ), 
               nKEY=sum(isKEY==descName), 
               binKEY=ifelse(isKEY==descName, 1, 0)
               )
    
    # Extract a balanced sample of data matching keyLoc and all-other data, if underSample==TRUE
    if (isTRUE(underSample)) {
        subDF <- subDF %>%
            group_by(isKEY) %>%
            sample_n(min(nKEY)) %>% # not ideal coding, but it works
            ungroup()
    }

    # Run the function shell
    outData <- xgbRunModel(subDF, 
                           depVar="binKEY", 
                           predVars=predVars, 
                           otherVars=otherVars, 
                           seed=seed,
                           nrounds=nrounds,
                           print_every_n=print_every_n, 
                           xgbObjective=xgbObjective, 
                           ...
                           )
    
    # Calculate overall accuracy
    accOverall <- outData$testData %>%
        mutate(correct=round(predicted)==binKEY) %>%
        pull(correct) %>%
        mean()
    
    # Report on finishing
    cat("Finished processing", keyLoc, "with overall test set accuracy", round(accOverall, 3), "\n")
    
    # Return the list
    outData
    
}

The function is then run for Las Vegas, including all 2016 cities:

xgb_las2016_check <- helperXGBOnevAll(metarData, 
                                      keyLoc="Las Vegas, NV", 
                                      critFilter=list(year=2016), 
                                      seed=2008061322
                                      )
## 
## 
##  **************
## Running for Las Vegas, NV with description Las Vegas 
## [1]  train-error:0.104506 
## [101]    train-error:0.007196 
## [201]    train-error:0.000409 
## [301]    train-error:0.000000 
## [401]    train-error:0.000000 
## [500]    train-error:0.000000 
## Finished processing Las Vegas, NV with overall test set accuracy 0.965
# Report accuracy without Phoenix
xgb_las2016_check$testData %>%
    filter(locNamefct != "Phoenix, AZ") %>%
    mutate(correct=round(predicted)==binKEY) %>%
    pull(correct) %>%
    mean() %>%
    round(3)
## [1] 0.976

Overall test set accuracy as reported is 96.5%. Excluding Phoenix, which is often classified as Las Vegas and was excluded in the previous run, accuracy is roughly 97%, as before.

Suppose that every locale is run, with the objective of finding cities that are most and least unique:

# Extract the locales that are available in 2016
locs2016 <- metarData %>%
    filter(year==2016) %>%
    pull(locNamefct) %>%
    as.character() %>%
    unique() %>%
    sort()

# Create a container to hold each of the results
xgbOnevAll <- vector("list", length=length(locs2016))
names(xgbOnevAll) <- locs2016

# Run through the XGB process
n <- 1
for (loc in locs2016) {

    # Run and save the XGB model
    xgbOnevAll[[n]] <- helperXGBOnevAll(metarData, 
                                        keyLoc=loc, 
                                        critFilter=list(year=2016), 
                                        seed=2008061340+n
                                        )
    
    # Index the counter
    n <- n + 1
    
}
## 
## 
##  **************
## Running for Atlanta, GA with description Atlanta 
## [1]  train-error:0.323565 
## [101]    train-error:0.043099 
## [201]    train-error:0.013142 
## [301]    train-error:0.003020 
## [401]    train-error:0.000327 
## [500]    train-error:0.000082 
## Finished processing Atlanta, GA with overall test set accuracy 0.902 
## 
## 
##  **************
## Running for Boston, MA with description Boston 
## [1]  train-error:0.312644 
## [101]    train-error:0.066854 
## [201]    train-error:0.020386 
## [301]    train-error:0.005447 
## [401]    train-error:0.001073 
## [500]    train-error:0.000083 
## Finished processing Boston, MA with overall test set accuracy 0.892 
## 
## 
##  **************
## Running for Chicago, IL with description Chicago 
## [1]  train-error:0.386542 
## [101]    train-error:0.097308 
## [201]    train-error:0.042577 
## [301]    train-error:0.014356 
## [401]    train-error:0.005220 
## [500]    train-error:0.001223 
## Finished processing Chicago, IL with overall test set accuracy 0.798 
## 
## 
##  **************
## Running for Dallas, TX with description Dallas 
## [1]  train-error:0.276697 
## [101]    train-error:0.035487 
## [201]    train-error:0.010466 
## [301]    train-error:0.003352 
## [401]    train-error:0.000572 
## [500]    train-error:0.000082 
## Finished processing Dallas, TX with overall test set accuracy 0.911 
## 
## 
##  **************
## Running for Denver, CO with description Denver 
## [1]  train-error:0.149882 
## [101]    train-error:0.004329 
## [201]    train-error:0.000000 
## [301]    train-error:0.000000 
## [401]    train-error:0.000000 
## [500]    train-error:0.000000 
## Finished processing Denver, CO with overall test set accuracy 0.972 
## 
## 
##  **************
## Running for Detroit, MI with description Detroit 
## [1]  train-error:0.373577 
## [101]    train-error:0.093210 
## [201]    train-error:0.041363 
## [301]    train-error:0.017119 
## [401]    train-error:0.006225 
## [500]    train-error:0.001884 
## Finished processing Detroit, MI with overall test set accuracy 0.802 
## 
## 
##  **************
## Running for Grand Rapids, MI with description Grand Rapids 
## [1]  train-error:0.367789 
## [101]    train-error:0.090127 
## [201]    train-error:0.036994 
## [301]    train-error:0.013573 
## [401]    train-error:0.003890 
## [500]    train-error:0.001241 
## Finished processing Grand Rapids, MI with overall test set accuracy 0.795 
## 
## 
##  **************
## Running for Green Bay, WI with description Green Bay 
## [1]  train-error:0.299343 
## [101]    train-error:0.054807 
## [201]    train-error:0.019638 
## [301]    train-error:0.006409 
## [401]    train-error:0.001397 
## [500]    train-error:0.000493 
## Finished processing Green Bay, WI with overall test set accuracy 0.854 
## 
## 
##  **************
## Running for Houston, TX with description Houston 
## [1]  train-error:0.222458 
## [101]    train-error:0.045521 
## [201]    train-error:0.014384 
## [301]    train-error:0.003432 
## [401]    train-error:0.001144 
## [500]    train-error:0.000327 
## Finished processing Houston, TX with overall test set accuracy 0.913 
## 
## 
##  **************
## Running for Indianapolis, IN with description Indianapolis 
## [1]  train-error:0.375595 
## [101]    train-error:0.089470 
## [201]    train-error:0.030999 
## [301]    train-error:0.011481 
## [401]    train-error:0.003280 
## [500]    train-error:0.000656 
## Finished processing Indianapolis, IN with overall test set accuracy 0.809 
## 
## 
##  **************
## Running for Las Vegas, NV with description Las Vegas 
## [1]  train-error:0.098127 
## [101]    train-error:0.006215 
## [201]    train-error:0.000327 
## [301]    train-error:0.000000 
## [401]    train-error:0.000000 
## [500]    train-error:0.000000 
## Finished processing Las Vegas, NV with overall test set accuracy 0.963 
## 
## 
##  **************
## Running for Lincoln, NE with description Lincoln 
## [1]  train-error:0.287360 
## [101]    train-error:0.060462 
## [201]    train-error:0.016423 
## [301]    train-error:0.004821 
## [401]    train-error:0.001307 
## [500]    train-error:0.000082 
## Finished processing Lincoln, NE with overall test set accuracy 0.869 
## 
## 
##  **************
## Running for Los Angeles, CA with description Los Angeles 
## [1]  train-error:0.171377 
## [101]    train-error:0.026385 
## [201]    train-error:0.007678 
## [301]    train-error:0.001960 
## [401]    train-error:0.000327 
## [500]    train-error:0.000082 
## Finished processing Los Angeles, CA with overall test set accuracy 0.934 
## 
## 
##  **************
## Running for Madison, WI with description Madison 
## [1]  train-error:0.365856 
## [101]    train-error:0.082205 
## [201]    train-error:0.032716 
## [301]    train-error:0.013369 
## [401]    train-error:0.004069 
## [500]    train-error:0.001661 
## Finished processing Madison, WI with overall test set accuracy 0.809 
## 
## 
##  **************
## Running for Miami, FL with description Miami 
## [1]  train-error:0.125469 
## [101]    train-error:0.014449 
## [201]    train-error:0.002449 
## [301]    train-error:0.000163 
## [401]    train-error:0.000000 
## [500]    train-error:0.000000 
## Finished processing Miami, FL with overall test set accuracy 0.96 
## 
## 
##  **************
## Running for Milwaukee, WI with description Milwaukee 
## [1]  train-error:0.361758 
## [101]    train-error:0.089766 
## [201]    train-error:0.044025 
## [301]    train-error:0.017643 
## [401]    train-error:0.006453 
## [500]    train-error:0.001552 
## Finished processing Milwaukee, WI with overall test set accuracy 0.798 
## 
## 
##  **************
## Running for Minneapolis, MN with description Minneapolis 
## [1]  train-error:0.354010 
## [101]    train-error:0.072302 
## [201]    train-error:0.027796 
## [301]    train-error:0.008559 
## [401]    train-error:0.001549 
## [500]    train-error:0.000245 
## Finished processing Minneapolis, MN with overall test set accuracy 0.842 
## 
## 
##  **************
## Running for New Orleans, LA with description New Orleans 
## [1]  train-error:0.202465 
## [101]    train-error:0.008813 
## [201]    train-error:0.002040 
## [301]    train-error:0.000245 
## [401]    train-error:0.000000 
## [500]    train-error:0.000000 
## Finished processing New Orleans, LA with overall test set accuracy 0.967 
## 
## 
##  **************
## Running for Newark, NJ with description Newark 
## [1]  train-error:0.327080 
## [101]    train-error:0.072186 
## [201]    train-error:0.031403 
## [301]    train-error:0.011664 
## [401]    train-error:0.002855 
## [500]    train-error:0.000571 
## Finished processing Newark, NJ with overall test set accuracy 0.85 
## 
## 
##  **************
## Running for Philadelphia, PA with description Philadelphia 
## [1]  train-error:0.363911 
## [101]    train-error:0.072326 
## [201]    train-error:0.032208 
## [301]    train-error:0.012720 
## [401]    train-error:0.004077 
## [500]    train-error:0.001631 
## Finished processing Philadelphia, PA with overall test set accuracy 0.844 
## 
## 
##  **************
## Running for Phoenix, AZ with description Phoenix 
## [1]  train-error:0.109726 
## [101]    train-error:0.006069 
## [201]    train-error:0.000902 
## [301]    train-error:0.000164 
## [401]    train-error:0.000000 
## [500]    train-error:0.000000 
## Finished processing Phoenix, AZ with overall test set accuracy 0.965 
## 
## 
##  **************
## Running for Saint Louis, MO with description Saint Louis 
## [1]  train-error:0.379026 
## [101]    train-error:0.064009 
## [201]    train-error:0.020677 
## [301]    train-error:0.007332 
## [401]    train-error:0.002059 
## [500]    train-error:0.000165 
## Finished processing Saint Louis, MO with overall test set accuracy 0.864 
## 
## 
##  **************
## Running for San Antonio, TX with description San Antonio 
## [1]  train-error:0.225978 
## [101]    train-error:0.001551 
## [201]    train-error:0.000000 
## [301]    train-error:0.000000 
## [401]    train-error:0.000000 
## [500]    train-error:0.000000 
## Finished processing San Antonio, TX with overall test set accuracy 0.982 
## 
## 
##  **************
## Running for San Diego, CA with description San Diego 
## [1]  train-error:0.152268 
## [101]    train-error:0.014630 
## [201]    train-error:0.002289 
## [301]    train-error:0.000409 
## [401]    train-error:0.000000 
## [500]    train-error:0.000000 
## Finished processing San Diego, CA with overall test set accuracy 0.949 
## 
## 
##  **************
## Running for San Francisco, CA with description San Francisco 
## [1]  train-error:0.159878 
## [101]    train-error:0.021888 
## [201]    train-error:0.005678 
## [301]    train-error:0.001234 
## [401]    train-error:0.000165 
## [500]    train-error:0.000000 
## Finished processing San Francisco, CA with overall test set accuracy 0.929 
## 
## 
##  **************
## Running for San Jose, CA with description San Jose 
## [1]  train-error:0.159951 
## [101]    train-error:0.035644 
## [201]    train-error:0.011501 
## [301]    train-error:0.003344 
## [401]    train-error:0.000489 
## [500]    train-error:0.000163 
## Finished processing San Jose, CA with overall test set accuracy 0.915 
## 
## 
##  **************
## Running for Seattle, WA with description Seattle 
## [1]  train-error:0.216713 
## [101]    train-error:0.004166 
## [201]    train-error:0.000163 
## [301]    train-error:0.000000 
## [401]    train-error:0.000000 
## [500]    train-error:0.000000 
## Finished processing Seattle, WA with overall test set accuracy 0.975 
## 
## 
##  **************
## Running for Tampa Bay, FL with description Tampa Bay 
## [1]  train-error:0.184260 
## [101]    train-error:0.025553 
## [201]    train-error:0.007021 
## [301]    train-error:0.002123 
## [401]    train-error:0.000408 
## [500]    train-error:0.000000 
## Finished processing Tampa Bay, FL with overall test set accuracy 0.923 
## 
## 
##  **************
## Running for Traverse City, MI with description Traverse City 
## [1]  train-error:0.277315 
## [101]    train-error:0.061163 
## [201]    train-error:0.021966 
## [301]    train-error:0.007921 
## [401]    train-error:0.002613 
## [500]    train-error:0.000327 
## Finished processing Traverse City, MI with overall test set accuracy 0.846 
## 
## 
##  **************
## Running for Washington, DC with description Washington 
## [1]  train-error:0.335551 
## [101]    train-error:0.067258 
## [201]    train-error:0.026443 
## [301]    train-error:0.010512 
## [401]    train-error:0.002628 
## [500]    train-error:0.000575 
## Finished processing Washington, DC with overall test set accuracy 0.865

Overall accuracy by locale can then be assessed:

# Function to calculate and extract overall accuracy from the XGB list
helperXGBListAccuracyOverall <- function(lst) {
    
    lst[["testData"]] %>%
        select(binKEY, predicted) %>%
        mutate(correct=binKEY==round(predicted)) %>%
        pull(correct) %>%
        mean()
    
}

# Extract overall accuracy by locale and format as a tibble
accList <- sapply(xgbOnevAll, FUN=helperXGBListAccuracyOverall) %>% 
    as.data.frame() %>% 
    rename("accOverall"=".") %>% 
    rownames_to_column("locale") %>% 
    tibble::as_tibble()

# Plot the overall accuracy by locale
ggplot(accList, aes(x=fct_reorder(locale, accOverall), y=accOverall)) + 
    geom_col(fill="lightblue") + 
    geom_text(aes(y=accOverall+0.02, label=round(accOverall, 3)), hjust=0) + 
    coord_flip() + 
    labs(x="", y="Overall Accuracy", title="Accuracy of Classifying Locale vs. All Other")

Interestingly, San Antonio stands out as the most distinct of the cities in the analysis. Several of the locales used previously as distinct archetypes - New Orleans, Las Vegas, San Diego - also score reasonably high in this exercise. As expected, the cold-weather cities are the most difficult to distinguish, with classification success in the 80% range (null success 50% given deliberate under-sampling of all-other).
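
Because the classes are deliberately balanced, chance accuracy is 50%, so accuracy can be rescaled to a skill-above-chance score: with expected agreement of 0.5, Cohen's kappa reduces to 2*accuracy - 1. A sketch using three of the reported accuracies:

```r
# Skill above chance for a balanced binary task: 0 = coin flip, 1 = perfect
acc <- c(SanAntonio = 0.982, LasVegas = 0.963, Chicago = 0.798)
round(2 * acc - 1, 3)
```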

Accuracy can also be assessed as “classifying self correctly” and “classifying all-other correctly”:

# Function to calculate accuracy by class (self vs. all-other) from the XGB list
helperXGBListAccuracySubset <- function(lst) {
    
    lst[["testData"]] %>%
        select(binKEY, predicted) %>%
        mutate(correct=binKEY==round(predicted)) %>%
        group_by(binKEY) %>%
        summarize(acc=mean(correct))
    
}

# Extract accuracy by subset by locale
accListSubset <- map_dfr(xgbOnevAll, .f=helperXGBListAccuracySubset, .id="locale") %>% 
    mutate(type=factor(ifelse(binKEY==1, "Classifying Self", "Classifying All Other"), 
                       levels=c("Classifying Self", "Classifying All Other")
                       )
           )

# Plot the accuracies by locale, faceted by self vs. all other
ggplot(accListSubset, aes(x=fct_reorder(locale, acc), y=acc)) + 
    geom_col(fill="lightblue") + 
    geom_text(aes(y=acc/2, label=round(acc, 3)), hjust=0) + 
    coord_flip() + 
    labs(x="", y="Accuracy", title="Accuracy of Classifying Locale vs. All Other") + 
    facet_wrap(~type)

Models are generally more successful in classifying self than in classifying all-other. Put differently, a model is more likely to incorrectly flag an All Other locale as Self than to incorrectly flag Self as All Other.

Errors in classifying All Other as Self can then be explored further:

# Function to calculate and extract accuracy by locName fct from the XGB list
helperXGBListErrorByTrueLocation <- function(lst) {
    
    lst[["testData"]] %>%
        select(locNamefct, binKEY, predicted) %>%
        mutate(correct=binKEY==round(predicted)) %>%
        filter(binKEY==0) %>%
        group_by(locNamefct) %>%
        summarize(acc=mean(correct))
    
}

# Extract by locale
accListTrueLocation <- map_dfr(xgbOnevAll, .f=helperXGBListErrorByTrueLocation, .id="modelLocale")

# Plot accuracy
accFactors <- accListTrueLocation %>% 
    group_by(modelLocale) %>% 
    summarize(meanAcc=mean(acc)) %>%
    arrange(meanAcc) %>%
    pull(modelLocale)

accListTrueLocation %>%
    ggplot(aes(x=factor(modelLocale, levels=accFactors), 
               y=factor(locNamefct, levels=accFactors)
               )
           ) + 
    geom_tile(aes(fill=1-acc)) + 
    geom_text(aes(label=paste0(round(100*(1-acc)), "%")),size=3) + 
    scale_x_discrete(position="top") + 
    labs(x="Modeling Locale", 
         y="True Locale", 
         title="Rate of Misclassifying True Locale as Modeling Locale"
         ) + 
    theme(axis.text.x=element_text(angle=90)) + 
    scale_fill_continuous("Error Rate", low="white", high="red")

When Seattle and Denver are just a small part of the deliberately under-sampled All Other class, they are routinely misclassified as being part of many other locales. When they are the full-sampled class, almost no other city is misclassified with any frequency as being them. This suggests that Seattle and Denver may each be distinct, but in a rather nuanced manner that requires significant training data volumes to learn.

Next steps are to explore some of the more challenging one-vs-one classifications suggested by this grid, then to extend the analysis to explore the multi-class classification capabilities of XGB.

Locale pairs that are difficult to classify are identified based on the preceding analysis. There are two types of “difficult to classify”:

  1. Generally difficult - A is often classified as B when the model is trained as B vs. all, while B is often classified as A when the model is trained as A vs. all
  2. Directionally difficult - A is often classified as B when the model is trained as B vs. all, while B is often correctly classified as “not A” when the model is trained as A vs. all
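With illustrative (made-up) directional accuracies, the two cases separate cleanly when each pair is summarized by its mean error and its absolute difference in error:

```r
# Generally difficult (hypothetical): both directions err often
acc1 <- 0.20; acc <- 0.25
c(meanError = 1 - (acc1 + acc) / 2, deltaError = abs(acc - acc1))
# meanError 0.775, deltaError 0.05 -> high mean, low delta

# Directionally difficult (hypothetical): only one direction errs
acc1 <- 0.90; acc <- 0.30
c(meanError = 1 - (acc1 + acc) / 2, deltaError = abs(acc - acc1))
# meanError 0.40, deltaError 0.60 -> moderate mean, high delta
```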

The file accListTrueLocation is merged to itself to identify these cases:

# Create the frame of mean error and mean difference in error for A vs all and B vs all
accSummary <- accListTrueLocation %>%
    mutate(x=factor(modelLocale), modelLocale=locNamefct, locNamefct=x) %>%
    select(modelLocale, locNamefct, acc1=acc) %>%
    inner_join(mutate(accListTrueLocation, modelLocale=factor(modelLocale))) %>%
    mutate(meanError=1-(acc1 + acc)/2, deltaError=abs(acc-acc1))
## Joining, by = c("modelLocale", "locNamefct")
# The file contains two (practically identical) records for each city pair - once as A/B, once as B/A
# Keep only the records where modelLocale sorts alphabetically before locNamefct
accSummaryUnique <- accSummary %>%
    filter(pmin(as.character(modelLocale), as.character(locNamefct))==as.character(modelLocale))

# Highest mean error
highMeanError <- accSummaryUnique %>%
    arrange(-meanError) %>%
    filter(meanError >= 0.5)
highMeanError
## # A tibble: 29 x 6
##    modelLocale      locNamefct        acc1   acc meanError deltaError
##    <fct>            <fct>            <dbl> <dbl>     <dbl>      <dbl>
##  1 Newark, NJ       Philadelphia, PA 0.108 0.165     0.863     0.0564
##  2 Miami, FL        Tampa Bay, FL    0.231 0.338     0.715     0.106 
##  3 Philadelphia, PA Washington, DC   0.362 0.267     0.685     0.0958
##  4 Green Bay, WI    Madison, WI      0.205 0.466     0.664     0.260 
##  5 Detroit, MI      Grand Rapids, MI 0.348 0.333     0.659     0.0150
##  6 Las Vegas, NV    Phoenix, AZ      0.322 0.366     0.656     0.0440
##  7 Los Angeles, CA  San Diego, CA    0.408 0.294     0.649     0.114 
##  8 Chicago, IL      Milwaukee, WI    0.421 0.281     0.649     0.140 
##  9 Madison, WI      Milwaukee, WI    0.404 0.375     0.610     0.0293
## 10 Chicago, IL      Detroit, MI      0.436 0.349     0.607     0.0862
## # ... with 19 more rows
highMeanError %>% mutate(n=n()) %>% select_if(is.numeric) %>% summarize_all(mean)
## # A tibble: 1 x 5
##    acc1   acc meanError deltaError     n
##   <dbl> <dbl>     <dbl>      <dbl> <dbl>
## 1 0.430 0.388     0.591      0.111    29
# Highest delta error
highDeltaError <- accSummaryUnique %>%
    arrange(-deltaError) %>%
    filter(deltaError >= 0.2)
highDeltaError
## # A tibble: 22 x 6
##    modelLocale      locNamefct       acc1   acc meanError deltaError
##    <fct>            <fct>           <dbl> <dbl>     <dbl>      <dbl>
##  1 New Orleans, LA  Tampa Bay, FL   0.309 0.744     0.474      0.436
##  2 Houston, TX      New Orleans, LA 0.728 0.293     0.490      0.436
##  3 Dallas, TX       San Antonio, TX 0.842 0.474     0.342      0.368
##  4 Houston, TX      Miami, FL       0.763 0.412     0.412      0.351
##  5 San Jose, CA     Seattle, WA     1     0.670     0.165      0.330
##  6 Atlanta, GA      San Antonio, TX 0.938 0.622     0.220      0.316
##  7 Milwaukee, WI    Seattle, WA     0.933 0.617     0.225      0.316
##  8 Grand Rapids, MI Seattle, WA     0.955 0.667     0.189      0.288
##  9 Madison, WI      Seattle, WA     0.935 0.648     0.208      0.287
## 10 Saint Louis, MO  San Antonio, TX 0.862 0.598     0.270      0.264
## # ... with 12 more rows
highDeltaError %>% mutate(n=n()) %>% select_if(is.numeric) %>% summarize_all(mean)
## # A tibble: 1 x 5
##    acc1   acc meanError deltaError     n
##   <dbl> <dbl>     <dbl>      <dbl> <dbl>
## 1 0.716 0.582     0.351      0.274    22

There are 29 locale pairs where the average error rate is worse than a coin flip (mean error for these pairs is roughly 59%). There are 22 locale pairs where the directional error rate differs by at least 20%.

Each of these groupings is run through the prediction process individually, with the goal of seeing how the error rates evolve when the pairs are compared head-to-head. Function helperXGBOnevAll() is modified slightly so that no under-sampling occurs when the underSample parameter is passed as FALSE. This is because the one vs. one comparisons are already of comparable size, and an attempt to under-sample could cause errors due to very small differences in METAR data capture (bad data) by locale:
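As a rough sketch of what that toggle might look like (the real implementation lives in ‘WeatherModelingFunctions_v001.R’; the function body and names below are illustrative assumptions, not the actual code):

```r
# Hypothetical sketch of the underSample toggle inside helperXGBOnevAll()
buildBinaryFrame <- function(df, keyLoc, underSample = TRUE, seed = NULL) {
    if (!is.null(seed)) set.seed(seed)
    
    # 1 = key locale ('self'), 0 = everything else ('all other')
    df$binKEY <- as.integer(df$locNamefct == keyLoc)
    
    if (underSample) {
        # Down-sample 'all other' rows to match the number of key-locale rows
        nKey  <- sum(df$binKEY == 1)
        other <- df[df$binKEY == 0, , drop = FALSE]
        df <- rbind(df[df$binKEY == 1, , drop = FALSE],
                    other[sample(nrow(other), min(nKey, nrow(other))), , drop = FALSE])
    }
    
    df
}
```

With underSample=FALSE the two locales enter the model at their natural, already comparable sizes.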

# Create container for each pairing in highMeanError
# Name for modelLocale
highMeanErrorList <- vector("list", nrow(highMeanError))
names(highMeanErrorList) <- highMeanError$modelLocale

# Run model for each pairing in highMeanError
for (n in 1:nrow(highMeanError)) {
    
    # Extract the key locale and the other locale
    # Note that which locale is defined as key is arbitrary and unimportant since this is a full 1:1 comparison
    keyLoc <- as.character(highMeanError$modelLocale)[n]
    otherLoc <- as.character(highMeanError$locNamefct)[n]
    
    # Run XGB for 500 rounds using only two locales and 2016 data; do not under-sample 'all other'
    highMeanErrorList[[n]] <- helperXGBOnevAll(metarData, 
                                               keyLoc=keyLoc, 
                                               critFilter=list(year=2016, 
                                                               locNamefct=c(keyLoc, otherLoc)
                                                               ), 
                                               underSample=FALSE,
                                               seed=2008071346
                                               )
    
}
## 
## 
##  **************
## Running for Newark, NJ with decription Newark 
## [1]  train-error:0.411434 
## [101]    train-error:0.142880 
## [201]    train-error:0.064427 
## [301]    train-error:0.032376 
## [401]    train-error:0.012641 
## [500]    train-error:0.004241 
## Finished processing Newark, NJ with overall test set accuracy 0.696 
## 
## 
##  **************
## Running for Miami, FL with decription Miami 
## [1]  train-error:0.257245 
## [101]    train-error:0.072822 
## [201]    train-error:0.033880 
## [301]    train-error:0.014287 
## [401]    train-error:0.007021 
## [500]    train-error:0.002449 
## Finished processing Miami, FL with overall test set accuracy 0.854 
## 
## 
##  **************
## Running for Philadelphia, PA with decription Philadelphia 
## [1]  train-error:0.365027 
## [101]    train-error:0.105720 
## [201]    train-error:0.046068 
## [301]    train-error:0.020538 
## [401]    train-error:0.006628 
## [500]    train-error:0.002046 
## Finished processing Philadelphia, PA with overall test set accuracy 0.767 
## 
## 
##  **************
## Running for Green Bay, WI with decription Green Bay 
## [1]  train-error:0.375268 
## [101]    train-error:0.087229 
## [201]    train-error:0.033454 
## [301]    train-error:0.013630 
## [401]    train-error:0.004791 
## [500]    train-error:0.001817 
## Finished processing Green Bay, WI with overall test set accuracy 0.793 
## 
## 
##  **************
## Running for Detroit, MI with decription Detroit 
## [1]  train-error:0.410259 
## [101]    train-error:0.115100 
## [201]    train-error:0.050469 
## [301]    train-error:0.025029 
## [401]    train-error:0.009633 
## [500]    train-error:0.002717 
## Finished processing Detroit, MI with overall test set accuracy 0.751 
## 
## 
##  **************
## Running for Las Vegas, NV with decription Las Vegas 
## [1]  train-error:0.229201 
## [101]    train-error:0.020554 
## [201]    train-error:0.003685 
## [301]    train-error:0.000491 
## [401]    train-error:0.000082 
## [500]    train-error:0.000000 
## Finished processing Las Vegas, NV with overall test set accuracy 0.948 
## 
## 
##  **************
## Running for Los Angeles, CA with decription Los Angeles 
## [1]  train-error:0.246527 
## [101]    train-error:0.055646 
## [201]    train-error:0.023206 
## [301]    train-error:0.008416 
## [401]    train-error:0.002942 
## [500]    train-error:0.000735 
## Finished processing Los Angeles, CA with overall test set accuracy 0.876 
## 
## 
##  **************
## Running for Chicago, IL with decription Chicago 
## [1]  train-error:0.424537 
## [101]    train-error:0.124561 
## [201]    train-error:0.055179 
## [301]    train-error:0.024325 
## [401]    train-error:0.010040 
## [500]    train-error:0.004245 
## Finished processing Chicago, IL with overall test set accuracy 0.709 
## 
## 
##  **************
## Running for Madison, WI with decription Madison 
## [1]  train-error:0.366549 
## [101]    train-error:0.110187 
## [201]    train-error:0.048835 
## [301]    train-error:0.016635 
## [401]    train-error:0.005435 
## [500]    train-error:0.001729 
## Finished processing Madison, WI with overall test set accuracy 0.757 
## 
## 
##  **************
## Running for Chicago, IL with decription Chicago 
## [1]  train-error:0.381528 
## [101]    train-error:0.093339 
## [201]    train-error:0.044626 
## [301]    train-error:0.017736 
## [401]    train-error:0.004904 
## [500]    train-error:0.000981 
## Finished processing Chicago, IL with overall test set accuracy 0.783 
## 
## 
##  **************
## Running for Newark, NJ with decription Newark 
## [1]  train-error:0.358078 
## [101]    train-error:0.082665 
## [201]    train-error:0.034457 
## [301]    train-error:0.011049 
## [401]    train-error:0.004092 
## [500]    train-error:0.000900 
## Finished processing Newark, NJ with overall test set accuracy 0.82 
## 
## 
##  **************
## Running for Madison, WI with decription Madison 
## [1]  train-error:0.348650 
## [101]    train-error:0.084074 
## [201]    train-error:0.030138 
## [301]    train-error:0.009058 
## [401]    train-error:0.003870 
## [500]    train-error:0.001070 
## Finished processing Madison, WI with overall test set accuracy 0.829 
## 
## 
##  **************
## Running for Boston, MA with decription Boston 
## [1]  train-error:0.355678 
## [101]    train-error:0.077781 
## [201]    train-error:0.029291 
## [301]    train-error:0.010010 
## [401]    train-error:0.002543 
## [500]    train-error:0.000820 
## Finished processing Boston, MA with overall test set accuracy 0.816 
## 
## 
##  **************
## Running for Chicago, IL with decription Chicago 
## [1]  train-error:0.394594 
## [101]    train-error:0.109842 
## [201]    train-error:0.046172 
## [301]    train-error:0.017992 
## [401]    train-error:0.007148 
## [500]    train-error:0.002136 
## Finished processing Chicago, IL with overall test set accuracy 0.763 
## 
## 
##  **************
## Running for Green Bay, WI with decription Green Bay 
## [1]  train-error:0.367822 
## [101]    train-error:0.090440 
## [201]    train-error:0.036782 
## [301]    train-error:0.013025 
## [401]    train-error:0.003932 
## [500]    train-error:0.001720 
## Finished processing Green Bay, WI with overall test set accuracy 0.794 
## 
## 
##  **************
## Running for Detroit, MI with decription Detroit 
## [1]  train-error:0.373303 
## [101]    train-error:0.096025 
## [201]    train-error:0.041714 
## [301]    train-error:0.017177 
## [401]    train-error:0.007280 
## [500]    train-error:0.002454 
## Finished processing Detroit, MI with overall test set accuracy 0.815 
## 
## 
##  **************
## Running for Grand Rapids, MI with decription Grand Rapids 
## [1]  train-error:0.343531 
## [101]    train-error:0.079410 
## [201]    train-error:0.028037 
## [301]    train-error:0.007916 
## [401]    train-error:0.001484 
## [500]    train-error:0.000082 
## Finished processing Grand Rapids, MI with overall test set accuracy 0.827 
## 
## 
##  **************
## Running for Madison, WI with decription Madison 
## [1]  train-error:0.362537 
## [101]    train-error:0.096413 
## [201]    train-error:0.042530 
## [301]    train-error:0.014314 
## [401]    train-error:0.005512 
## [500]    train-error:0.001645 
## Finished processing Madison, WI with overall test set accuracy 0.797 
## 
## 
##  **************
## Running for Chicago, IL with decription Chicago 
## [1]  train-error:0.340219 
## [101]    train-error:0.092832 
## [201]    train-error:0.039832 
## [301]    train-error:0.016542 
## [401]    train-error:0.004691 
## [500]    train-error:0.001234 
## Finished processing Chicago, IL with overall test set accuracy 0.778 
## 
## 
##  **************
## Running for Chicago, IL with decription Chicago 
## [1]  train-error:0.396990 
## [101]    train-error:0.095281 
## [201]    train-error:0.035986 
## [301]    train-error:0.015621 
## [401]    train-error:0.005398 
## [500]    train-error:0.001636 
## Finished processing Chicago, IL with overall test set accuracy 0.815 
## 
## 
##  **************
## Running for Grand Rapids, MI with decription Grand Rapids 
## [1]  train-error:0.302754 
## [101]    train-error:0.085491 
## [201]    train-error:0.033457 
## [301]    train-error:0.011262 
## [401]    train-error:0.004768 
## [500]    train-error:0.001069 
## Finished processing Grand Rapids, MI with overall test set accuracy 0.828 
## 
## 
##  **************
## Running for Grand Rapids, MI with decription Grand Rapids 
## [1]  train-error:0.367164 
## [101]    train-error:0.066233 
## [201]    train-error:0.019689 
## [301]    train-error:0.006014 
## [401]    train-error:0.001236 
## [500]    train-error:0.000082 
## Finished processing Grand Rapids, MI with overall test set accuracy 0.854 
## 
## 
##  **************
## Running for Boston, MA with decription Boston 
## [1]  train-error:0.341920 
## [101]    train-error:0.072518 
## [201]    train-error:0.026333 
## [301]    train-error:0.010090 
## [401]    train-error:0.002543 
## [500]    train-error:0.000246 
## Finished processing Boston, MA with overall test set accuracy 0.845 
## 
## 
##  **************
## Running for Houston, TX with decription Houston 
## [1]  train-error:0.273485 
## [101]    train-error:0.053341 
## [201]    train-error:0.021157 
## [301]    train-error:0.006372 
## [401]    train-error:0.002287 
## [500]    train-error:0.000327 
## Finished processing Houston, TX with overall test set accuracy 0.882 
## 
## 
##  **************
## Running for Detroit, MI with decription Detroit 
## [1]  train-error:0.348135 
## [101]    train-error:0.070412 
## [201]    train-error:0.025352 
## [301]    train-error:0.006624 
## [401]    train-error:0.001717 
## [500]    train-error:0.000327 
## Finished processing Detroit, MI with overall test set accuracy 0.845 
## 
## 
##  **************
## Running for Grand Rapids, MI with decription Grand Rapids 
## [1]  train-error:0.368565 
## [101]    train-error:0.095416 
## [201]    train-error:0.040040 
## [301]    train-error:0.013844 
## [401]    train-error:0.005305 
## [500]    train-error:0.001492 
## Finished processing Grand Rapids, MI with overall test set accuracy 0.798 
## 
## 
##  **************
## Running for Detroit, MI with decription Detroit 
## [1]  train-error:0.362400 
## [101]    train-error:0.073103 
## [201]    train-error:0.027618 
## [301]    train-error:0.008933 
## [401]    train-error:0.002868 
## [500]    train-error:0.000574 
## Finished processing Detroit, MI with overall test set accuracy 0.831 
## 
## 
##  **************
## Running for Milwaukee, WI with decription Milwaukee 
## [1]  train-error:0.407719 
## [101]    train-error:0.082409 
## [201]    train-error:0.028394 
## [301]    train-error:0.011668 
## [401]    train-error:0.003753 
## [500]    train-error:0.000653 
## Finished processing Milwaukee, WI with overall test set accuracy 0.837 
## 
## 
##  **************
## Running for Grand Rapids, MI with decription Grand Rapids 
## [1]  train-error:0.403437 
## [101]    train-error:0.116090 
## [201]    train-error:0.044808 
## [301]    train-error:0.017101 
## [401]    train-error:0.007564 
## [500]    train-error:0.002713 
## Finished processing Grand Rapids, MI with overall test set accuracy 0.772

The error rates are then extracted (overall and by subtype):

# Helper function to extract one vs one accuracy
helperExtractOnevOneAccuracy <- function(lst) {
    
    # Create accuracy for each sub-group and overall
    rawSummary <- lst[["testData"]] %>%
        count(locNamefct, correct=round(predicted)==binKEY) %>%
        mutate(locNamefct=as.character(locNamefct)) %>%
        bind_rows(mutate(., locNamefct="Overall")) %>%
        group_by(locNamefct) %>%
        summarize(nTotal=sum(n), nCorrect=sum(n*(correct==TRUE))) %>%
        ungroup() %>%
        mutate(acc=nCorrect/nTotal)
    
    # Create one-row tibble with locA, locB, accA, accB, accOverall
    locs <- rawSummary$locNamefct %>% setdiff("Overall")
    tibble::tibble(locA=min(locs), 
                   locB=max(locs), 
                   accA=rawSummary %>% filter(locNamefct==locA) %>% pull(acc),
                   accB=rawSummary %>% filter(locNamefct==locB) %>% pull(acc), 
                   accOverall=rawSummary %>% filter(locNamefct=="Overall") %>% pull(acc)
                   )
    
}

# Extract accuracy by subset by locale
accHighMeanError <- map_dfr(highMeanErrorList, .f=helperExtractOnevOneAccuracy)

# Combine with the original file, highMeanError
highMeanOutput <- highMeanError %>%
    mutate(locA=as.character(modelLocale), 
           locB=as.character(locNamefct), 
           original=1-meanError
           ) %>%
    select(locA, locB, original) %>%
    inner_join(accHighMeanError, by=c("locA", "locB")) %>%
    select(-accA, -accB, new=accOverall) %>%
    pivot_longer(-c(locA, locB), names_to="model", values_to="accuracy") %>%
    mutate(desc=paste0(locA, " vs. ", locB)) 

# Plot the accuracy data
highMeanOutput %>%
    ggplot(aes(x=fct_reorder(desc, accuracy), y=accuracy, color=model)) + 
    geom_point() + 
    geom_text(aes(y=accuracy-0.02, label=round(accuracy, 2)), hjust=1, size=4) +
    coord_flip() + 
    labs(x="", y="Accuracy", title="Change in Accuracy - Original One vs. All, New One vs. One") + 
    geom_hline(aes(yintercept=0.5), lty=2) + 
    ylim(c(0, 1))

# Plot the change in accuracy data
highMeanOutput %>%
    group_by(desc) %>%
    summarize(accuracyGain=max(accuracy)-min(accuracy)) %>%
    ggplot(aes(x=fct_reorder(desc, accuracyGain), y=accuracyGain)) + 
    geom_col(fill="lightblue") + 
    geom_text(aes(y=accuracyGain+0.02, label=round(accuracyGain, 2)), hjust=0) + 
    coord_flip() + 
    labs(x="", y="Gain in Accuracy", title="Gain in Accuracy (One vs. One compared to One vs. All)")

# Check the highest delta error records
accHighMeanError %>%
    mutate(deltaAccuracy=abs(accA-accB)) %>%
    arrange(-deltaAccuracy)
## # A tibble: 29 x 6
##    locA             locB               accA  accB accOverall deltaAccuracy
##    <chr>            <chr>             <dbl> <dbl>      <dbl>         <dbl>
##  1 Madison, WI      Traverse City, MI 0.809 0.849      0.829        0.0407
##  2 Chicago, IL      Madison, WI       0.761 0.795      0.778        0.0341
##  3 Milwaukee, WI    Minneapolis, MN   0.851 0.823      0.837        0.0282
##  4 Philadelphia, PA Washington, DC    0.755 0.779      0.767        0.0232
##  5 Detroit, MI      Traverse City, MI 0.835 0.856      0.845        0.0215
##  6 Newark, NJ       Philadelphia, PA  0.686 0.706      0.696        0.0198
##  7 Grand Rapids, MI Madison, WI       0.789 0.807      0.798        0.0181
##  8 Green Bay, WI    Milwaukee, WI     0.786 0.803      0.794        0.0177
##  9 Chicago, IL      Grand Rapids, MI  0.771 0.754      0.763        0.0164
## 10 Chicago, IL      Indianapolis, IN  0.808 0.823      0.815        0.0155
## # ... with 19 more rows

Accuracy gains are significant, with many comparisons gaining ~40 percentage points when trained one vs. one rather than being just a small component of a one vs. all training. Gains are especially notable for Las Vegas/Phoenix, which soars from 34% accuracy (worse than null) to 95% accuracy. Large gains are also noted for Miami/Tampa and Los Angeles/San Diego, suggesting that focused one vs. one training can help pull apart similar archetype cities.

That said, there remain many pairings where differentiation is only moderate: accuracy around 80% against a null of 50%, with meaningful confusion persisting.

There is little delta error whether looking at A/B or B/A, which is not surprising given the balanced nature of the data used for modeling and the generally small delta error for these pairings even in the previous one vs. all analysis.

The process is run again for the high delta-error pairings:

# Create container for each pairing in highDeltaError
# Name for modelLocale
highDeltaErrorList <- vector("list", nrow(highDeltaError))
names(highDeltaErrorList) <- highDeltaError$modelLocale

# Run model for each pairing in highDeltaError
for (n in 1:nrow(highDeltaError)) {
    
    # Extract the key locale and the other locale
    # Note that which locale is defined as key is arbitrary and unimportant since this is a full 1:1 comparison
    keyLoc <- as.character(highDeltaError$modelLocale)[n]
    otherLoc <- as.character(highDeltaError$locNamefct)[n]
    
    # Run XGB for 500 rounds using only two locales and 2016 data; do not under-sample 'all other'
    highDeltaErrorList[[n]] <- helperXGBOnevAll(metarData, 
                                                keyLoc=keyLoc, 
                                                critFilter=list(year=2016, 
                                                                locNamefct=c(keyLoc, otherLoc)
                                                                ), 
                                                underSample=FALSE,
                                                seed=2008071439
                                                )
    
}
## 
## 
##  **************
## Running for New Orleans, LA with decription New Orleans 
## [1]  train-error:0.305934 
## [101]    train-error:0.007754 
## [201]    train-error:0.001551 
## [301]    train-error:0.000082 
## [401]    train-error:0.000000 
## [500]    train-error:0.000000 
## Finished processing New Orleans, LA with overall test set accuracy 0.976 
## 
## 
##  **************
## Running for Houston, TX with decription Houston 
## [1]  train-error:0.350674 
## [101]    train-error:0.016660 
## [201]    train-error:0.003838 
## [301]    train-error:0.000898 
## [401]    train-error:0.000163 
## [500]    train-error:0.000000 
## Finished processing Houston, TX with overall test set accuracy 0.95 
## 
## 
##  **************
## Running for Dallas, TX with decription Dallas 
## [1]  train-error:0.302778 
## [101]    train-error:0.000408 
## [201]    train-error:0.000000 
## [301]    train-error:0.000000 
## [401]    train-error:0.000000 
## [500]    train-error:0.000000 
## Finished processing Dallas, TX with overall test set accuracy 0.995 
## 
## 
##  **************
## Running for Houston, TX with decription Houston 
## [1]  train-error:0.220534 
## [101]    train-error:0.033407 
## [201]    train-error:0.011027 
## [301]    train-error:0.003185 
## [401]    train-error:0.000490 
## [500]    train-error:0.000082 
## Finished processing Houston, TX with overall test set accuracy 0.923 
## 
## 
##  **************
## Running for San Jose, CA with decription San Jose 
## [1]  train-error:0.172721 
## [101]    train-error:0.000082 
## [201]    train-error:0.000000 
## [301]    train-error:0.000000 
## [401]    train-error:0.000000 
## [500]    train-error:0.000000 
## Finished processing San Jose, CA with overall test set accuracy 0.996 
## 
## 
##  **************
## Running for Atlanta, GA with decription Atlanta 
## [1]  train-error:0.226776 
## [101]    train-error:0.000327 
## [201]    train-error:0.000000 
## [301]    train-error:0.000000 
## [401]    train-error:0.000000 
## [500]    train-error:0.000000 
## Finished processing Atlanta, GA with overall test set accuracy 0.996 
## 
## 
##  **************
## Running for Milwaukee, WI with decription Milwaukee 
## [1]  train-error:0.185346 
## [101]    train-error:0.002859 
## [201]    train-error:0.000000 
## [301]    train-error:0.000000 
## [401]    train-error:0.000000 
## [500]    train-error:0.000000 
## Finished processing Milwaukee, WI with overall test set accuracy 0.985 
## 
## 
##  **************
## Running for Grand Rapids, MI with decription Grand Rapids 
## [1]  train-error:0.196185 
## [101]    train-error:0.002302 
## [201]    train-error:0.000000 
## [301]    train-error:0.000000 
## [401]    train-error:0.000000 
## [500]    train-error:0.000000 
## Finished processing Grand Rapids, MI with overall test set accuracy 0.984 
## 
## 
##  **************
## Running for Madison, WI with decription Madison 
## [1]  train-error:0.177730 
## [101]    train-error:0.003871 
## [201]    train-error:0.000082 
## [301]    train-error:0.000000 
## [401]    train-error:0.000000 
## [500]    train-error:0.000000 
## Finished processing Madison, WI with overall test set accuracy 0.982 
## 
## 
##  **************
## Running for Saint Louis, MO with decription Saint Louis 
## [1]  train-error:0.279400 
## [101]    train-error:0.001312 
## [201]    train-error:0.000000 
## [301]    train-error:0.000000 
## [401]    train-error:0.000000 
## [500]    train-error:0.000000 
## Finished processing Saint Louis, MO with overall test set accuracy 0.99 
## 
## 
##  **************
## Running for Green Bay, WI with decription Green Bay 
## [1]  train-error:0.372873 
## [101]    train-error:0.092268 
## [201]    train-error:0.035602 
## [301]    train-error:0.013051 
## [401]    train-error:0.003635 
## [500]    train-error:0.000991 
## Finished processing Green Bay, WI with overall test set accuracy 0.791 
## 
## 
##  **************
## Running for Grand Rapids, MI with decription Grand Rapids 
## [1]  train-error:0.341717 
## [101]    train-error:0.077925 
## [201]    train-error:0.023501 
## [301]    train-error:0.007751 
## [401]    train-error:0.002391 
## [500]    train-error:0.000412 
## Finished processing Grand Rapids, MI with overall test set accuracy 0.826 
## 
## 
##  **************
## Running for Atlanta, GA with decription Atlanta 
## [1]  train-error:0.285691 
## [101]    train-error:0.020699 
## [201]    train-error:0.002863 
## [301]    train-error:0.000164 
## [401]    train-error:0.000000 
## [500]    train-error:0.000000 
## Finished processing Atlanta, GA with overall test set accuracy 0.937 
## 
## 
##  **************
## Running for Chicago, IL with decription Chicago 
## [1]  train-error:0.296744 
## [101]    train-error:0.060230 
## [201]    train-error:0.016404 
## [301]    train-error:0.003754 
## [401]    train-error:0.000979 
## [500]    train-error:0.000082 
## Finished processing Chicago, IL with overall test set accuracy 0.862 
## 
## 
##  **************
## Running for Green Bay, WI with decription Green Bay 
## [1]  train-error:0.154924 
## [101]    train-error:0.006063 
## [201]    train-error:0.000164 
## [301]    train-error:0.000000 
## [401]    train-error:0.000000 
## [500]    train-error:0.000000 
## Finished processing Green Bay, WI with overall test set accuracy 0.975 
## 
## 
##  **************
## Running for Milwaukee, WI with decription Milwaukee 
## [1]  train-error:0.293287 
## [101]    train-error:0.070974 
## [201]    train-error:0.025237 
## [301]    train-error:0.008821 
## [401]    train-error:0.001797 
## [500]    train-error:0.000245 
## Finished processing Milwaukee, WI with overall test set accuracy 0.835 
## 
## 
##  **************
## Running for Atlanta, GA with decription Atlanta 
## [1]  train-error:0.324559 
## [101]    train-error:0.046494 
## [201]    train-error:0.014432 
## [301]    train-error:0.004100 
## [401]    train-error:0.001148 
## [500]    train-error:0.000410 
## Finished processing Atlanta, GA with overall test set accuracy 0.892 
## 
## 
##  **************
## Running for Indianapolis, IN with decription Indianapolis 
## [1]  train-error:0.343717 
## [101]    train-error:0.045049 
## [201]    train-error:0.011201 
## [301]    train-error:0.001308 
## [401]    train-error:0.000245 
## [500]    train-error:0.000000 
## Finished processing Indianapolis, IN with overall test set accuracy 0.896 
## 
## 
##  **************
## Running for Detroit, MI with decription Detroit 
## [1]  train-error:0.325319 
## [101]    train-error:0.077363 
## [201]    train-error:0.027723 
## [301]    train-error:0.008996 
## [401]    train-error:0.002290 
## [500]    train-error:0.000491 
## Finished processing Detroit, MI with overall test set accuracy 0.843 
## 
## 
##  **************
## Running for Miami, FL with decription Miami 
## [1]  train-error:0.265100 
## [101]    train-error:0.009631 
## [201]    train-error:0.002367 
## [301]    train-error:0.000163 
## [401]    train-error:0.000000 
## [500]    train-error:0.000000 
## Finished processing Miami, FL with overall test set accuracy 0.967 
## 
## 
##  **************
## Running for San Diego, CA with decription San Diego 
## [1]  train-error:0.238916 
## [101]    train-error:0.021801 
## [201]    train-error:0.004573 
## [301]    train-error:0.000653 
## [401]    train-error:0.000082 
## [500]    train-error:0.000000 
## Finished processing San Diego, CA with overall test set accuracy 0.94 
## 
## 
##  **************
## Running for Boston, MA with decription Boston 
## [1]  train-error:0.328260 
## [101]    train-error:0.032003 
## [201]    train-error:0.005594 
## [301]    train-error:0.001070 
## [401]    train-error:0.000082 
## [500]    train-error:0.000000 
## Finished processing Boston, MA with overall test set accuracy 0.949

The same accuracy comparisons are then run for these pairings:

# Extract accuracy by subset by locale
accHighDeltaError <- map_dfr(highDeltaErrorList, .f=helperExtractOnevOneAccuracy)

# Combine with the original file, highDeltaError
highDeltaOutput <- highDeltaError %>%
    mutate(locA=as.character(modelLocale), 
           locB=as.character(locNamefct), 
           original=1-meanError
           ) %>%
    select(locA, locB, original) %>%
    inner_join(accHighDeltaError, by=c("locA", "locB")) %>%
    select(-accA, -accB, new=accOverall) %>%
    pivot_longer(-c(locA, locB), names_to="model", values_to="accuracy") %>%
    mutate(desc=paste0(locA, " vs. ", locB)) 

# Plot the accuracy data
highDeltaOutput %>%
    ggplot(aes(x=fct_reorder(desc, accuracy), y=accuracy, color=model)) + 
    geom_point() + 
    geom_text(aes(y=accuracy-0.02, label=round(accuracy, 2)), hjust=1, size=4) +
    coord_flip() + 
    labs(x="", y="Accuracy", title="Change in Accuracy - Original One vs. All, New One vs. One") + 
    geom_hline(aes(yintercept=0.5), lty=2) + 
    ylim(c(0, 1))

# Plot the change in accuracy data
highDeltaOutput %>%
    group_by(desc) %>%
    summarize(accuracyGain=max(accuracy)-min(accuracy)) %>%
    ggplot(aes(x=fct_reorder(desc, accuracyGain), y=accuracyGain)) + 
    geom_col(fill="lightblue") + 
    geom_text(aes(y=accuracyGain+0.02, label=round(accuracyGain, 2)), hjust=0) + 
    coord_flip() + 
    labs(x="", y="Gain in Accuracy", title="Gain in Accuracy (One vs. One compared to One vs. All)")

# Check the highest delta error records
accHighDeltaError %>%
    mutate(deltaAccuracy=abs(accA-accB)) %>%
    arrange(-deltaAccuracy)
## # A tibble: 22 x 6
##    locA             locB               accA  accB accOverall deltaAccuracy
##    <chr>            <chr>             <dbl> <dbl>      <dbl>         <dbl>
##  1 Indianapolis, IN Minneapolis, MN   0.882 0.911      0.896        0.0293
##  2 Atlanta, GA      Saint Louis, MO   0.904 0.879      0.892        0.0251
##  3 Chicago, IL      Traverse City, MI 0.851 0.873      0.862        0.0221
##  4 Detroit, MI      Traverse City, MI 0.833 0.853      0.843        0.0199
##  5 Milwaukee, WI    Traverse City, MI 0.826 0.845      0.835        0.0189
##  6 Houston, TX      New Orleans, LA   0.944 0.957      0.950        0.0132
##  7 Green Bay, WI    Madison, WI       0.796 0.785      0.791        0.0112
##  8 Grand Rapids, MI Seattle, WA       0.978 0.989      0.984        0.0112
##  9 Green Bay, WI    Seattle, WA       0.969 0.980      0.975        0.0108
## 10 Boston, MA       Indianapolis, IN  0.954 0.943      0.949        0.0104
## # ... with 12 more rows

There is no longer any meaningful difference in the A/B and B/A accuracy for these pairings, and the overall accuracy has in almost all cases climbed into the 90%+ range. This suggests that these pairings are largely distinct but need a large training sample to tease out the distinctions. The one exception is Detroit vs. Traverse City, which is notably the only pairing that was also part of the “low overall accuracy” runs.

Next steps are to explore what predictions look like for unrelated cities using the one vs. one models. This may induce significant extrapolation errors, and is intended solely as an exploratory step. The next main model will be the multiclass version of the XGB, taking care to have sufficient training data volume in an attempt to meaningfully capture the accuracy gains seen in the preceding analysis.

Converting the XGB approach to work for multiclass classification will require the following:

  1. Convert the target variable to an integer that indexes from 0 to n-1 (where there are n classes); note that the binary case (n=2) already follows this convention, since the XGB algorithm uses 0 and 1 there
  2. Convert the objective to “multi:softprob”
  3. Convert the evaluation metric (though this should happen automatically)
  4. Convert the predicted class indices back to the original factor levels
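These steps can be sketched in base R with toy data (this is an illustration of the recoding logic only, not the xgbRunModel() implementation; the probabilities are made up):

```r
# Toy illustration of the multiclass recoding steps (steps 1 and 4);
# the locale names are real, the probabilities are invented
fourLocales <- c("Chicago, IL", "Las Vegas, NV", "New Orleans, LA", "San Diego, CA")
y <- factor(fourLocales)
yLevels <- levels(y)

# Step 1: recode the target to integers 0 to n-1
yInt <- as.integer(y) - 1L

# Steps 2-3: pass objective="multi:softprob" and num_class=n to xgboost;
# the evaluation metric then defaults to merror

# Step 4: predict() returns one probability per class per row as a single
# vector; reshape it by row and map the most probable column to its label
probs <- c(0.7, 0.1, 0.1, 0.1,   # row 1: most likely Chicago, IL
           0.1, 0.2, 0.6, 0.1)   # row 2: most likely New Orleans, LA
probMat <- matrix(probs, ncol=length(yLevels), byrow=TRUE)
predLabels <- yLevels[apply(probMat, 1, which.max)]
```

The same by-row reshape is what the prediction-matrix code later in this module performs when scoring out-of-sample data.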

The function xgbRunModel() is updated to accept xgbObjective=“multi:softprob”, and then data are created for the four locales with 2014-2019 data - Chicago, Las Vegas, New Orleans, San Diego:

fourLocales <- c("Chicago, IL", "Las Vegas, NV", "New Orleans, LA", "San Diego, CA")

xgbFourLocalesCV <- xgbRunModel(filter(metarData, !is.na(TempF)), 
                                depVar="locNamefct", 
                                predVars=locXGBPreds, 
                                otherVars=keepVarFull, 
                                critFilter=list(year=2016, locNamefct=fourLocales), 
                                seed=2008081315,
                                nrounds=1000,
                                print_every_n=50,
                                xgbObjective="multi:softprob", 
                                funcRun=xgboost::xgb.cv,
                                nfold=5,
                                calcErr=FALSE,
                                num_class=length(fourLocales)
                                )
## [1]  train-merror:0.178692+0.001724  test-merror:0.191964+0.006761 
## [51] train-merror:0.019284+0.000676  test-merror:0.047940+0.003303 
## [101]    train-merror:0.003491+0.000273  test-merror:0.033607+0.002575 
## [151]    train-merror:0.000326+0.000083  test-merror:0.027727+0.002249 
## [201]    train-merror:0.000031+0.000025  test-merror:0.024909+0.001959 
## [251]    train-merror:0.000000+0.000000  test-merror:0.023643+0.001509 
## [301]    train-merror:0.000000+0.000000  test-merror:0.023031+0.001820 
## [351]    train-merror:0.000000+0.000000  test-merror:0.022541+0.001797 
## [401]    train-merror:0.000000+0.000000  test-merror:0.021847+0.001639 
## [451]    train-merror:0.000000+0.000000  test-merror:0.021642+0.001794 
## [501]    train-merror:0.000000+0.000000  test-merror:0.021234+0.001608 
## [551]    train-merror:0.000000+0.000000  test-merror:0.021071+0.001730 
## [601]    train-merror:0.000000+0.000000  test-merror:0.020948+0.001783 
## [651]    train-merror:0.000000+0.000000  test-merror:0.020621+0.001766 
## [701]    train-merror:0.000000+0.000000  test-merror:0.020744+0.001442 
## [751]    train-merror:0.000000+0.000000  test-merror:0.020581+0.001503 
## [801]    train-merror:0.000000+0.000000  test-merror:0.020458+0.001589 
## [851]    train-merror:0.000000+0.000000  test-merror:0.020377+0.001651 
## [901]    train-merror:0.000000+0.000000  test-merror:0.020621+0.001566 
## [951]    train-merror:0.000000+0.000000  test-merror:0.020662+0.001351 
## [1000]   train-merror:0.000000+0.000000  test-merror:0.020581+0.001313

Error evolution can then be plotted:

minTestError <- xgbFourLocalesCV$xgbModel$evaluation_log %>%
    pull(test_merror_mean) %>%
    min()

xgbFourLocalesCV$xgbModel$evaluation_log %>%
    filter(test_merror_mean <= 0.08) %>%
    ggplot(aes(x=iter, y=test_merror_mean)) +
    geom_line() +
    labs(x="# Iterations", 
         y="Test Error", 
         title="Evolution of Test Error", 
         subtitle="Filtered to Test Error <= 0.08"
         ) + 
    ylim(c(0, NA)) + 
    geom_vline(aes(xintercept=0), lty=2) + 
    geom_hline(aes(yintercept=minTestError), color="red", lty=2)

Test error appears to be near its minimum at around 750 iterations. The xgboost::xgboost algorithm is run with 750 iterations, with predictions then made on the held-out test data:

fourLocales <- c("Chicago, IL", "Las Vegas, NV", "New Orleans, LA", "San Diego, CA")

xgbFourLocales <- xgbRunModel(filter(metarData, !is.na(TempF)), 
                              depVar="locNamefct", 
                              predVars=locXGBPreds, 
                              otherVars=keepVarFull, 
                              critFilter=list(year=2016, locNamefct=fourLocales), 
                              seed=2008081315,
                              nrounds=750,
                              print_every_n=50,
                              xgbObjective="multi:softprob", 
                              funcRun=xgboost::xgboost,
                              calcErr=FALSE,
                              num_class=length(fourLocales)
                              )
## [1]  train-merror:0.180081 
## [51] train-merror:0.021112 
## [101]    train-merror:0.004614 
## [151]    train-merror:0.000449 
## [201]    train-merror:0.000041 
## [251]    train-merror:0.000000 
## [301]    train-merror:0.000000 
## [351]    train-merror:0.000000 
## [401]    train-merror:0.000000 
## [451]    train-merror:0.000000 
## [501]    train-merror:0.000000 
## [551]    train-merror:0.000000 
## [601]    train-merror:0.000000 
## [651]    train-merror:0.000000 
## [701]    train-merror:0.000000 
## [750]    train-merror:0.000000
## Warning: `as_tibble.matrix()` requires a matrix with column names or a `.name_repair` argument. Using compatibility `.name_repair`.
## This warning is displayed once per session.

Classification performance on test data can then be assessed:

# Overall success
xgbFourLocales$testData %>%
    summarize(mean(locNamefct==predicted))
## # A tibble: 1 x 1
##   `mean(locNamefct == predicted)`
##                             <dbl>
## 1                           0.985
# Histogram of predicted probabilities for selected class
xgbFourLocales$testData %>%
    mutate(correct=locNamefct==predicted) %>%
    ggplot() + 
    stat_count(aes(x=round(probPredicted, 2), y=..prop.., group=correct)) + 
    facet_wrap(~correct) + 
    labs(x="Probability Given to Prediction", 
         y="Proportion of Predictions", 
         title="Probability of Prediction vs. Accuracy of Prediction"
         )

# Confusion matrix
xgbFourLocales$testData %>%
    count(locNamefct, predicted) %>%
    group_by(locNamefct) %>%
    mutate(pct=n/sum(n)) %>%
    ggplot(aes(x=predicted, y=locNamefct)) + 
    geom_tile(aes(fill=pct)) + 
    geom_text(aes(label=paste0(round(100*pct), "%"))) + 
    scale_fill_continuous("", low="white", high="green") + 
    labs(title="Predicted vs. Actual Locale Frequency", y="Actual Locale", x="Predicted Locale")

# Find and plot importances
xgb_fourLocales_importance <- plotXGBImportance(xgbFourLocales, 
                                                featureStems=locXGBPreds, 
                                                stemMapper = varMapper, 
                                                plotTitle="Gain by variable in xgboost", 
                                                plotSubtitle="Modeling 2016 Locale (LAS, MSY, ORD, SAN)"
                                                )

Overall prediction accuracy is ~98%, with incorrect predictions having much lower probabilities than correct predictions. Predictions for every locale appear to be of about equal (and very high) accuracy. Dew point stands out as the best differentiator, with temperature, month, and sea-level pressure next highest in gain. This is consistent with previous analysis of these locales.

Overall, the base XGB parameters seem to be driving increased multiclass accuracy in a much shorter run time than random forest.

How well does the 2016 model predict data from 2014-2015 and 2017-2019 for these same locales?

# Extract data for the four locales, excluding 2016
non2016FourLocalesData <- metarData %>%
    filter(!is.na(TempF), year != 2016, locNamefct %in% fourLocales)

# Make the predictions
non2016FourLocalesPred <- non2016FourLocalesData %>%
    mutate_if(is.factor, .funs=fct_drop) %>%
    helperMakeSparse(predVars=locXGBPreds) %>%
    predict(xgbFourLocales$xgbModel, newdata=.)

# Create the prediction matrix
non2016FourLocalesMatrix <- matrix(data=non2016FourLocalesPred, 
                                   ncol=length(fourLocales), 
                                   nrow=nrow(non2016FourLocalesData), 
                                   byrow=TRUE
                                   )

# Get the predictions and probabilities, and add them to non2016FourLocalesData
maxCol <- apply(non2016FourLocalesMatrix, 1, FUN=which.max)
non2016FourLocalesData <- non2016FourLocalesData %>%
    mutate(predicted=xgbFourLocales$yTrainLevels[maxCol], 
           probPredicted=apply(non2016FourLocalesMatrix, 1, FUN=max), 
           correct=locNamefct==predicted
           )

# Assess overall prediction accuracy by year
helperNon2016Accuracy <- function(df, grpVar, grpLabel) {
    p1 <- df %>%
        group_by_at(grpVar) %>%
        summarize(pctCorrect=mean(correct)) %>%
        ggplot(aes(x=factor(get(grpVar)), y=pctCorrect)) + 
        geom_col(fill="lightblue") + 
        geom_text(aes(y=pctCorrect/2, label=paste0(round(100*pctCorrect), "%"))) +
        coord_flip() + 
        ylim(c(0, 1)) + 
        labs(x=grpLabel, y="Percent Correct", title="Accuracy of predictions to non-2016 data")
    print(p1)
}

helperNon2016Accuracy(non2016FourLocalesData, grpVar="year", grpLabel="Year")

helperNon2016Accuracy(non2016FourLocalesData, grpVar="locNamefct", grpLabel="Actual Locale")

helperNon2016Accuracy(non2016FourLocalesData, grpVar="month", grpLabel="Month")

helperNon2016Accuracy(non2016FourLocalesData, grpVar="hrfct", grpLabel="Hour (Zulu Time)")

# Confusion matrix
non2016FourLocalesData %>%
    count(locNamefct, predicted) %>%
    group_by(locNamefct) %>%
    mutate(pct=n/sum(n)) %>%
    ggplot(aes(x=predicted, y=locNamefct)) + 
    geom_tile(aes(fill=pct)) + 
    geom_text(aes(label=paste0(round(100*pct), "%"))) + 
    scale_fill_continuous("", low="white", high="green") + 
    labs(title="Predicted vs. Actual Locale Frequency", y="Actual Locale", x="Predicted Locale")

Prediction accuracy dips from ~98% on the 2016 test data to ~92% on the 2014-2015 and 2017-2019 data from the same locales. This suggests that while the model is learning significant features that differentiate these locales, it is also learning features specific to 2016.

Chicago and Las Vegas are better classified in years other than 2016, retaining ~95% accuracy. Predictions are generally better in the summer months than in the winter months, and there is no meaningful difference in accuracy by hour.

Next steps are to train the model on a larger tranche of data (2015-2018) to see if this can be better generalized to out-of-sample years (2014, 2019) for the same locales:

fourLocales <- c("Chicago, IL", "Las Vegas, NV", "New Orleans, LA", "San Diego, CA")

xgbFourLocalesCV_20152018 <- xgbRunModel(filter(metarData, !is.na(TempF)), 
                                         depVar="locNamefct", 
                                         predVars=locXGBPreds, 
                                         otherVars=keepVarFull, 
                                         critFilter=list(year=2015:2018, locNamefct=fourLocales), 
                                         seed=2008091301,
                                         nrounds=1000,
                                         print_every_n=50,
                                         xgbObjective="multi:softprob", 
                                         funcRun=xgboost::xgb.cv,
                                         nfold=5,
                                         calcErr=FALSE,
                                         num_class=length(fourLocales)
                                         )
## [1]  train-merror:0.213119+0.002119  test-merror:0.217464+0.003057 
## [51] train-merror:0.052677+0.000904  test-merror:0.069775+0.001018 
## [101]    train-merror:0.020469+0.000549  test-merror:0.041317+0.002381 
## [151]    train-merror:0.008479+0.000429  test-merror:0.028643+0.001862 
## [201]    train-merror:0.003949+0.000112  test-merror:0.024088+0.001306 
## [251]    train-merror:0.001940+0.000157  test-merror:0.021610+0.001605 
## [301]    train-merror:0.000934+0.000068  test-merror:0.020177+0.001172 
## [351]    train-merror:0.000422+0.000032  test-merror:0.019133+0.001343 
## [401]    train-merror:0.000207+0.000028  test-merror:0.018519+0.001087 
## [451]    train-merror:0.000092+0.000027  test-merror:0.018048+0.001045 
## [501]    train-merror:0.000031+0.000017  test-merror:0.017751+0.000815 
## [551]    train-merror:0.000013+0.000012  test-merror:0.017679+0.000737 
## [601]    train-merror:0.000008+0.000010  test-merror:0.017485+0.000653 
## [651]    train-merror:0.000000+0.000000  test-merror:0.017270+0.000610 
## [701]    train-merror:0.000000+0.000000  test-merror:0.017311+0.000621 
## [751]    train-merror:0.000000+0.000000  test-merror:0.017209+0.000785 
## [801]    train-merror:0.000000+0.000000  test-merror:0.017127+0.000729 
## [851]    train-merror:0.000000+0.000000  test-merror:0.017086+0.000828 
## [901]    train-merror:0.000000+0.000000  test-merror:0.017055+0.000811 
## [951]    train-merror:0.000000+0.000000  test-merror:0.017086+0.000821 
## [1000]   train-merror:0.000000+0.000000  test-merror:0.016911+0.000791

Results of the cross-validation can be assessed:

minTestError_20152018 <- xgbFourLocalesCV_20152018$xgbModel$evaluation_log %>%
    pull(test_merror_mean) %>%
    min()

xgbFourLocalesCV_20152018$xgbModel$evaluation_log %>%
    filter(test_merror_mean <= 0.08) %>%
    ggplot(aes(x=iter, y=test_merror_mean)) +
    geom_line() +
    labs(x="# Iterations", 
         y="Test Error", 
         title="Evolution of Test Error", 
         subtitle="Filtered to Test Error <= 0.08"
         ) + 
    ylim(c(0, NA)) + 
    geom_vline(aes(xintercept=0), lty=2) + 
    geom_hline(aes(yintercept=minTestError_20152018), color="red", lty=2)

The model using 2015-2018 data appears to drive slightly lower test classification error than the 2016-only modeling. The test error evolution appears to be stable by around 1000 iterations. The model is run for 1000 rounds:

fourLocales <- c("Chicago, IL", "Las Vegas, NV", "New Orleans, LA", "San Diego, CA")

xgbFourLocales_20152018 <- xgbRunModel(filter(metarData, !is.na(TempF)), 
                                       depVar="locNamefct", 
                                       predVars=locXGBPreds, 
                                       otherVars=keepVarFull, 
                                       critFilter=list(year=2015:2018, locNamefct=fourLocales), 
                                       seed=2008081315,
                                       nrounds=1000,
                                       print_every_n=50,
                                       xgbObjective="multi:softprob", 
                                       funcRun=xgboost::xgboost,
                                       calcErr=FALSE,
                                       num_class=length(fourLocales)
                                       )
## [1]  train-merror:0.210493 
## [51] train-merror:0.051513 
## [101]    train-merror:0.020843 
## [151]    train-merror:0.008927 
## [201]    train-merror:0.004842 
## [251]    train-merror:0.002518 
## [301]    train-merror:0.001351 
## [351]    train-merror:0.000758 
## [401]    train-merror:0.000369 
## [451]    train-merror:0.000174 
## [501]    train-merror:0.000082 
## [551]    train-merror:0.000051 
## [601]    train-merror:0.000020 
## [651]    train-merror:0.000000 
## [701]    train-merror:0.000000 
## [751]    train-merror:0.000000 
## [801]    train-merror:0.000000 
## [851]    train-merror:0.000000 
## [901]    train-merror:0.000000 
## [951]    train-merror:0.000000 
## [1000]   train-merror:0.000000
## Warning: `as_tibble.matrix()` requires a matrix with column names or a `.name_repair` argument. Using compatibility `.name_repair`.
## This warning is displayed once per session.

Model performance can be evaluated:

# Overall success
xgbFourLocales_20152018$testData %>%
    summarize(mean(locNamefct==predicted))
## # A tibble: 1 x 1
##   `mean(locNamefct == predicted)`
##                             <dbl>
## 1                           0.984
# Histogram of predicted probabilities for selected class
xgbFourLocales_20152018$testData %>%
    mutate(correct=locNamefct==predicted) %>%
    ggplot() + 
    stat_count(aes(x=round(probPredicted, 2), y=..prop.., group=correct)) + 
    facet_wrap(~correct) + 
    labs(x="Probability Given to Prediction", 
         y="Proportion of Predictions", 
         title="Probability of Prediction vs. Accuracy of Prediction"
         )

# Confusion matrix
xgbFourLocales_20152018$testData %>%
    count(locNamefct, predicted) %>%
    group_by(locNamefct) %>%
    mutate(pct=n/sum(n)) %>%
    ggplot(aes(x=predicted, y=locNamefct)) + 
    geom_tile(aes(fill=pct)) + 
    geom_text(aes(label=paste0(round(100*pct), "%"))) + 
    scale_fill_continuous("", low="white", high="green") + 
    labs(title="Predicted vs. Actual Locale Frequency", y="Actual Locale", x="Predicted Locale")

# Find and plot importances
xgb_fourLocales_importance_20152018 <- plotXGBImportance(xgbFourLocales_20152018, 
                                                         featureStems=locXGBPreds, 
                                                         stemMapper = varMapper, 
                                                         plotTitle="Gain by variable in xgboost", 
                                                         plotSubtitle="2015-2018 Locale (LAS, MSY, ORD, SAN)"
                                                         )

Accuracy and variable importances seem comparable to the previous model run using only 2016 data. Performance by dimension and the confusion matrix can be evaluated:

# Extract data for the four locales, excluding 2015-2018
non20152018FourLocalesData <- metarData %>%
    filter(!is.na(TempF), !(year %in% 2015:2018), locNamefct %in% fourLocales)

# Make the predictions
non20152018FourLocalesPred <- non20152018FourLocalesData %>%
    mutate_if(is.factor, .funs=fct_drop) %>%
    helperMakeSparse(predVars=locXGBPreds) %>%
    predict(xgbFourLocales_20152018$xgbModel, newdata=.)

# Create the prediction matrix
non20152018FourLocalesMatrix <- matrix(data=non20152018FourLocalesPred, 
                                       ncol=length(fourLocales), 
                                       nrow=nrow(non20152018FourLocalesData), 
                                       byrow=TRUE
                                       )

# Get the predictions and probabilities, and add them to non20152018FourLocalesData
maxCol <- apply(non20152018FourLocalesMatrix, 1, FUN=which.max)
non20152018FourLocalesData <- non20152018FourLocalesData %>%
    mutate(predicted=xgbFourLocales_20152018$yTrainLevels[maxCol], 
           probPredicted=apply(non20152018FourLocalesMatrix, 1, FUN=max), 
           correct=locNamefct==predicted
           )

# Assess overall prediction accuracy by dimension; the helper is redefined
# here so that the plot title reflects out-of-sample data
helperNon2016Accuracy <- function(df, grpVar, grpLabel) {
    p1 <- df %>%
        group_by_at(grpVar) %>%
        summarize(pctCorrect=mean(correct)) %>%
        ggplot(aes(x=factor(get(grpVar)), y=pctCorrect)) + 
        geom_col(fill="lightblue") + 
        geom_text(aes(y=pctCorrect/2, label=paste0(round(100*pctCorrect), "%"))) +
        coord_flip() + 
        ylim(c(0, 1)) + 
        labs(x=grpLabel, y="Percent Correct", title="Accuracy of predictions to out-of-sample data")
    print(p1)
}

helperNon2016Accuracy(non20152018FourLocalesData, grpVar="year", grpLabel="Year")

helperNon2016Accuracy(non20152018FourLocalesData, grpVar="locNamefct", grpLabel="Actual Locale")

helperNon2016Accuracy(non20152018FourLocalesData, grpVar="month", grpLabel="Month")

helperNon2016Accuracy(non20152018FourLocalesData, grpVar="hrfct", grpLabel="Hour (Zulu Time)")

# Confusion matrix
non20152018FourLocalesData %>%
    count(locNamefct, predicted) %>%
    group_by(locNamefct) %>%
    mutate(pct=n/sum(n)) %>%
    ggplot(aes(x=predicted, y=locNamefct)) + 
    geom_tile(aes(fill=pct)) + 
    geom_text(aes(label=paste0(round(100*pct), "%"))) + 
    scale_fill_continuous("", low="white", high="green") + 
    labs(title="Predicted vs. Actual Locale Frequency", y="Actual Locale", x="Predicted Locale")

Prediction quality for out-of-sample years increases meaningfully, from ~90% when the model is based only on 2016 data to ~95% when the model is based on 2015-2018 data. This suggests that modeling on multiple years helps train the model on recurring climate differences as opposed to sporadic local weather anomalies in a given year. Chicago appears to almost always be predicted as itself, though each other locale predicts as Chicago on occasion.

Data for all other locales are then run through the model to see which of the four cities each of the other cities is “most like”:

# Extract 2016 data for all locales other than the four modeled
otherLocalesData <- metarData %>%
    filter(!is.na(TempF), year == 2016, !(locNamefct %in% fourLocales))

# Make the predictions
otherLocalesPred <- otherLocalesData %>%
    mutate_if(is.factor, .funs=fct_drop) %>%
    helperMakeSparse(predVars=locXGBPreds) %>%
    predict(xgbFourLocales_20152018$xgbModel, newdata=.)

# Create the prediction matrix
otherLocalesMatrix <- matrix(data=otherLocalesPred, 
                             ncol=length(fourLocales), 
                             nrow=nrow(otherLocalesData), 
                             byrow=TRUE
                             )

# Get the predictions and probabilities, and add them to otherLocalesData
maxCol <- apply(otherLocalesMatrix, 1, FUN=which.max)
otherLocalesData <- otherLocalesData %>%
    mutate(predicted=xgbFourLocales_20152018$yTrainLevels[maxCol], 
           probPredicted=apply(otherLocalesMatrix, 1, FUN=max), 
           correct=locNamefct==predicted
           )

# Confusion matrix
otherLocalesData %>%
    count(locNamefct, predicted) %>%
    group_by(locNamefct) %>%
    mutate(pct=n/sum(n), pctChi=ifelse(predicted=="Chicago, IL", pct, 0)) %>%
    ggplot(aes(x=predicted, y=fct_reorder(locNamefct, pctChi, .fun=max))) + 
    geom_tile(aes(fill=pct)) + 
    geom_text(aes(label=paste0(round(100*pct), "%"))) + 
    scale_fill_continuous("", low="white", high="green") + 
    labs(title="Predicted vs. Actual Locale Frequency", y="Actual Locale", x="Predicted Locale")

Findings include:

  • Cold weather cities tend to be 90%+ assigned to Chicago
  • Phoenix is highly associated with Las Vegas, while, somewhat surprisingly, Denver and San Antonio also align primarily with Las Vegas
  • Houston, Tampa, and Miami align primarily with New Orleans, with some association also to San Diego
  • Coastal California cities tend to align to San Diego and Chicago, with the proportion of Chicago-like days higher in northern California than in Los Angeles

Next steps are to consolidate the functions for data preparation, modeling (including CV), and prediction for XGB such that the process is better aligned with functional programming.

Broadly, the XGB modeling process includes several steps:

  1. Filtering an initial dataset so that all records meet certain criteria (key variables not missing, locale subsetted, empty factor levels dropped, etc.)
  2. Splitting the initial dataset into test and train
  3. Separating a dataset into the response variable (recoded to 0 to n-1) and predictor variables (converted to sparse format with factors as dummy columns)
  4. Making predictions using a trained model
  5. Training a model and making predictions - this can use xgboost::xgboost or xgboost::xgb.cv with a user-specified objective function
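Step 3 can be illustrated with base R and the Matrix package, independent of the helpers defined in this module (the toy data frame here is purely illustrative):

```r
# Toy illustration of step 3: recode a factor response to 0..n-1 and build
# a sparse predictor matrix with one dummy column per factor level
library(Matrix)

toyData <- data.frame(
    locale = factor(c("A", "B", "C", "A")),         # response
    TempF  = c(46, 51, 70, 33),                     # numeric predictor
    month  = factor(c("Jan", "Jun", "Jan", "Dec"))  # factor predictor
)

# Response as integers 0 to n-1, as "multi:softprob" expects
yInt <- as.integer(toyData$locale) - 1L

# Predictors as a sparse matrix; dropping the intercept (-1) gives every
# level of the first factor its own dummy column
X <- Matrix::sparse.model.matrix(~ TempF + month - 1, data=toyData)
```

Here the four columns of X are TempF plus one dummy per month level; the module's helperMakeSparse() is assumed to produce an analogous structure.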

Code from above is copied and adapted as needed to achieve these steps. The first function filters an initial data frame to return data that are ready for preprocessing and then modeling:

# Function to filter an initial data frame based on specified criteria
helperFilterData <- function(df, 
                             keepVars=NULL,
                             noNA=keepVars, 
                             critFilter=vector("list", 0), 
                             critFilterNot=vector("list", 0), 
                             dropEmptyLevels=TRUE
                             ) {
    
    # FUNCTION ARGUMENTS:
    # df: the data frame or tibble to filter
    # keepVars: the variables to be kept (NULL means keep all)
    # noNA: character vector of variables that should have a "not NA" rule enforced (NULL means skip step)
    #       default is to apply noNA to every variable passed to keepVars
    # critFilter: filtering criteria passed as a named list (list(varName01=allowedValues01, ...))
    # critFilterNot: filtering criteria passed as a named list (list(varName01=disallowedValues01, ...))
    #                note that critFilterNot uses !disallowedValues01 while critFilter uses allowedValues01
    # dropEmptyLevels: whether to drop the empty levels of all factor variables
    
    # Remove the NA variables where passed as arguments
    if (!is.null(noNA)) {
        df <- df %>%
            filter_at(vars(all_of(noNA)), all_vars(!is.na(.)))
    }
    
    # Filter such that only matches to critFilter are included
    for (xNum in seq_len(length(critFilter))) {
        df <- df %>%
            filter_at(vars(all_of(names(critFilter)[xNum])), ~. %in% critFilter[[xNum]])
    }
    
    # Filter such that all matches to critFilterNot are excluded
    for (xNum in seq_len(length(critFilterNot))) {
        df <- df %>%
            filter_at(vars(all_of(names(critFilterNot)[xNum])), ~!(. %in% critFilterNot[[xNum]]))
    }
    
    # Keep only the requested variables if keepVars is not NULL
    if (!is.null(keepVars)) {
        df <- df %>%
            select_at(vars(all_of(keepVars)))
    }
    
    # Drop empty levels from factors if requested
    if (dropEmptyLevels) {
        df <- df %>%
            mutate_if(is.factor, .funs=fct_drop)
    }
    
    # Return the modified frame
    df
    
}

# Confirm that the function leaves the dataset unchanged with default parameters and dropEmptyLevels=FALSE
all.equal(metarData, helperFilterData(metarData, dropEmptyLevels=FALSE))
## [1] TRUE
# Example creation of a dataset for weather modeling
sampData <- helperFilterData(metarData, 
                             keepVars=c("source", "dtime", "TempF", "DewF", "month", "predomDir"), 
                             critFilter=list(year=2015, month=c("Jan", "Jun", "Jul", "Dec")), 
                             critFilterNot=list(predomDir=c("VRB", "000", "Error"), locNamefct="Las Vegas, NV")
                             ) %>%
    mutate(isMSY=source=="kmsy_2015")
sampData %>% count(source, isMSY)
## # A tibble: 3 x 3
##   source    isMSY     n
##   <chr>     <lgl> <int>
## 1 kmsy_2015 TRUE   2557
## 2 kord_2015 FALSE  2722
## 3 ksan_2015 FALSE  2027
sampData %>% count(month)
## # A tibble: 4 x 2
##   month     n
##   <fct> <int>
## 1 Jan    1747
## 2 Jun    1841
## 3 Jul    1887
## 4 Dec    1831
sampData %>% count(predomDir)
## # A tibble: 8 x 2
##   predomDir     n
##   <fct>     <int>
## 1 NE          758
## 2 E           427
## 3 SE          510
## 4 S          1189
## 5 SW         1025
## 6 W          1306
## 7 NW         1096
## 8 N           995
str(sampData)
## Classes 'tbl_df', 'tbl' and 'data.frame':    7306 obs. of  7 variables:
##  $ source   : chr  "kmsy_2015" "kmsy_2015" "kmsy_2015" "kmsy_2015" ...
##  $ dtime    : POSIXct, format: "2015-01-01 00:53:00" "2015-01-01 02:53:00" ...
##  $ TempF    : num  46 51.1 51.1 50 48.9 ...
##  $ DewF     : num  41 39.9 37.9 37 37.9 ...
##  $ month    : Factor w/ 4 levels "Jan","Jun","Jul",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ predomDir: Factor w/ 8 levels "NE","E","SE",..: 7 8 1 1 1 1 1 1 1 2 ...
##  $ isMSY    : logi  TRUE TRUE TRUE TRUE TRUE TRUE ...

The second step is already achieved by the function createTestTrain(), which creates a list with elements ‘trainData’ and ‘testData’:

sampList <- createTestTrain(sampData)
sampList
## $trainData
## # A tibble: 5,114 x 7
##    source    dtime               TempF  DewF month predomDir isMSY
##    <chr>     <dttm>              <dbl> <dbl> <fct> <fct>     <lgl>
##  1 ksan_2015 2015-06-30 00:51:00 72.0   62.1 Jun   NW        FALSE
##  2 kord_2015 2015-01-25 23:51:00 26.1   17.1 Jan   N         FALSE
##  3 kord_2015 2015-01-08 01:51:00 -2.92 -14.1 Jan   W         FALSE
##  4 kmsy_2015 2015-12-13 04:53:00 70.0   64.0 Dec   SE        TRUE 
##  5 kord_2015 2015-06-07 16:51:00 70.0   64.9 Jun   S         FALSE
##  6 ksan_2015 2015-06-01 03:51:00 63.0   55.0 Jun   W         FALSE
##  7 kmsy_2015 2015-01-23 23:53:00 45.0   41   Jan   NW        TRUE 
##  8 kmsy_2015 2015-01-15 21:53:00 46.0   42.1 Jan   N         TRUE 
##  9 ksan_2015 2015-07-18 01:51:00 73.0   60.1 Jul   NW        FALSE
## 10 kmsy_2015 2015-06-01 15:53:00 82.0   69.1 Jun   NE        TRUE 
## # ... with 5,104 more rows
## 
## $testData
## # A tibble: 2,192 x 7
##    source    dtime               TempF  DewF month predomDir isMSY
##    <chr>     <dttm>              <dbl> <dbl> <fct> <fct>     <lgl>
##  1 kmsy_2015 2015-01-01 00:53:00  46.0  41   Jan   NW        TRUE 
##  2 kmsy_2015 2015-01-01 07:53:00  46.9  39.0 Jan   NE        TRUE 
##  3 kmsy_2015 2015-01-01 11:53:00  46.0  39.9 Jan   E         TRUE 
##  4 kmsy_2015 2015-01-01 15:53:00  51.1  46.0 Jan   NE        TRUE 
##  5 kmsy_2015 2015-01-01 17:53:00  53.1  48.0 Jan   NE        TRUE 
##  6 kmsy_2015 2015-01-01 18:53:00  55.9  51.1 Jan   E         TRUE 
##  7 kmsy_2015 2015-01-01 22:53:00  57.0  52.0 Jan   NE        TRUE 
##  8 kmsy_2015 2015-01-01 23:53:00  57.9  53.1 Jan   NE        TRUE 
##  9 kmsy_2015 2015-01-02 05:53:00  57.0  52.0 Jan   E         TRUE 
## 10 kmsy_2015 2015-01-02 11:53:00  61.0  57.0 Jan   E         TRUE 
## # ... with 2,182 more rows
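
The real createTestTrain() is defined in ‘WeatherModelingFunctions_v001.R’; a minimal sketch of the same interface (a testSize fraction and an optional seed), assuming a simple random row split, would be:

```r
# Minimal sketch of a test/train splitter with the createTestTrain() interface
# (illustrative only; the project's actual implementation may differ)
createTestTrainSketch <- function(df, testSize=0.3, seed=NULL) {
    if (!is.null(seed)) set.seed(seed)
    # Sample row indices for the test set; everything else is training data
    testIdx <- sample(nrow(df), size=round(testSize * nrow(df)))
    list(trainData=df[-testIdx, ], testData=df[testIdx, ])
}

splitList <- createTestTrainSketch(mtcars, testSize=0.25, seed=1)
sapply(splitList, nrow)   # trainData 24 rows, testData 8 rows
```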

The third function recodes the data so that the response variable is either a numeric or a factor recoded as integers 0 through n-1, while the predictors become a sparse matrix in which each column is either a numeric variable or an indicator for one level of a factor (full contrasts, one column per level):

# Recode the data as appropriate for XGB modeling
helperXGBRecode <- function(df, 
                            depVar, 
                            predVars, 
                            depIsFactor=FALSE
                            ) {
    
    # FUNCTION ARGUMENTS:
    # df: the data frame or tibble for processing
    # depVar: the dependent variable
    # predVars: the predictor variables
    # depIsFactor: whether the dependent variable should be treated as a factor (classification)
    #              NULL means set TRUE if !is.numeric(depVar)
    
    # Pull the dependent variable
    y <- df[, depVar, drop=TRUE]
    
    # Set depIsFactor automatically if it was passed as NULL
    if (is.null(depIsFactor)) depIsFactor <- !is.numeric(y)
    
    # Convert y to factor if flag is set
    if (depIsFactor) {
        # Convert y to a factor variable if it is not already of that class
        if (!is.factor(y)) y <- factor(y)
        # Store the factor levels so they can be reconstituted if desired
        yLevels <- levels(y)
        # Convert the dependent variable to integers running from 0 to n-1
        y <- as.integer(y) - 1
    } else {
        # Set yLevels to NULL if it is not relevant
        yLevels <- NULL
    }

    # Create the sparse matrix for the predictor variables
    x <- df %>%
        select_at(vars(all_of(c(predVars)))) %>%
        Matrix::sparse.model.matrix(~ . -1, 
                                    data=., 
                                    contrasts.arg=lapply(.[, sapply(., is.factor)], contrasts, contrasts=FALSE)
                                    )
    
    # Return the key elements
    list(y=y, x=x, yLevels=yLevels)
    
}
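
The contrasts.arg handling matters because, with the intercept removed, only the first factor in a formula gets full dummy coding; later factors still drop their reference level. Passing contrasts=FALSE for every factor keeps one indicator column per level, which is the natural encoding for tree-based boosting. A small illustration (toy data, not the METAR variables):

```r
library(Matrix)

# Two toy factors: without contrasts.arg, the second factor drops level "x"
toy <- data.frame(f1=factor(c("a", "b", "a")), f2=factor(c("x", "y", "y")))

colnames(Matrix::sparse.model.matrix(~ . - 1, data=toy))
# [1] "f1a" "f1b" "f2y"

# With contrasts=FALSE, every level of every factor gets its own column
colnames(Matrix::sparse.model.matrix(~ . - 1, 
                                     data=toy, 
                                     contrasts.arg=lapply(toy, contrasts, contrasts=FALSE)
                                     ))
# [1] "f1a" "f1b" "f2x" "f2y"
```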

# Create a factor modeling list for multi-class
sampTrainFactor <- helperXGBRecode(sampList$trainData, 
                                   depVar="source", 
                                   predVars=c("TempF", "DewF", "month", "predomDir"), 
                                   depIsFactor=TRUE
                                   )
str(sampTrainFactor)
## List of 3
##  $ y      : num [1:5114] 2 1 1 0 1 2 0 0 2 0 ...
##  $ x      :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
##   .. ..@ i       : int [1:20456] 0 1 2 3 4 5 6 7 8 9 ...
##   .. ..@ p       : int [1:15] 0 5114 10228 11457 12759 14060 15342 15854 16159 16519 ...
##   .. ..@ Dim     : int [1:2] 5114 14
##   .. ..@ Dimnames:List of 2
##   .. .. ..$ : chr [1:5114] "1" "2" "3" "4" ...
##   .. .. ..$ : chr [1:14] "TempF" "DewF" "monthJan" "monthJun" ...
##   .. ..@ x       : num [1:20456] 71.96 26.06 -2.92 69.98 69.98 ...
##   .. ..@ factors : list()
##  $ yLevels: chr [1:3] "kmsy_2015" "kord_2015" "ksan_2015"
# Create a factor modeling list for single-class
sampTrainSingle <- helperXGBRecode(sampList$trainData, 
                                   depVar="isMSY", 
                                   predVars=c("TempF", "DewF", "month", "predomDir"), 
                                   depIsFactor=TRUE
                                   )
str(sampTrainSingle)
## List of 3
##  $ y      : num [1:5114] 0 0 0 1 0 0 1 1 0 1 ...
##  $ x      :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
##   .. ..@ i       : int [1:20456] 0 1 2 3 4 5 6 7 8 9 ...
##   .. ..@ p       : int [1:15] 0 5114 10228 11457 12759 14060 15342 15854 16159 16519 ...
##   .. ..@ Dim     : int [1:2] 5114 14
##   .. ..@ Dimnames:List of 2
##   .. .. ..$ : chr [1:5114] "1" "2" "3" "4" ...
##   .. .. ..$ : chr [1:14] "TempF" "DewF" "monthJan" "monthJun" ...
##   .. ..@ x       : num [1:20456] 71.96 26.06 -2.92 69.98 69.98 ...
##   .. ..@ factors : list()
##  $ yLevels: chr [1:2] "FALSE" "TRUE"
# Create a numeric modeling list
sampTrainFactor <- helperXGBRecode(sampList$trainData, 
                                   depVar="TempF", 
                                   predVars=c("source", "DewF", "month", "predomDir"), 
                                   depIsFactor=FALSE
                                   )
str(sampTrainFactor)
## List of 3
##  $ y      : num [1:5114] 71.96 26.06 -2.92 69.98 69.98 ...
##  $ x      :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
##   .. ..@ i       : int [1:20456] 3 6 7 9 10 12 22 24 28 32 ...
##   .. ..@ p       : int [1:17] 0 1762 3671 5114 10228 11457 12759 14060 15342 15854 ...
##   .. ..@ Dim     : int [1:2] 5114 16
##   .. ..@ Dimnames:List of 2
##   .. .. ..$ : chr [1:5114] "1" "2" "3" "4" ...
##   .. .. ..$ : chr [1:16] "sourcekmsy_2015" "sourcekord_2015" "sourceksan_2015" "DewF" ...
##   .. ..@ x       : num [1:20456] 1 1 1 1 1 1 1 1 1 1 ...
##   .. ..@ factors : list()
##  $ yLevels: NULL
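
Since yLevels is returned alongside the 0-based codes, the recoding is reversible; a quick check of the round trip:

```r
# A factor recoded to 0..n-1 (as XGB expects) can be reconstituted from yLevels
y <- factor(c("kord_2015", "kmsy_2015", "ksan_2015", "kmsy_2015"))
yLevels <- levels(y)
yCoded <- as.integer(y) - 1
yCoded
# [1] 1 0 2 0
factor(yLevels[yCoded + 1], levels=yLevels)   # identical to the original y
```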

A function to make predictions using an XGB model is created:

# Function to make predictions given a trained XGB model and an aligned sparse matrix
helperXGBPredict <- function(mdl, 
                             dfSparse, 
                             objective,
                             probMatrix=FALSE, 
                             yLevels=NULL
                             ) {
    
    # FUNCTION ARGUMENTS:
    # mdl: the trained XGB model
    # dfSparse: a sparse data frame that matches the columns used in mdl
    # objective: the objective function that was used (drives the prediction approach)
    # probMatrix: boolean, whether to create a probability matrix by class
    # yLevels: if probMatrix is TRUE, yLevels should be passed (names associated to classes 0 through n-1)
    
    # Check that yLevels is passed so the probability matrix can be properly constituted
    if (probMatrix && is.null(yLevels)) {
        stop("\nMust pass the yLevels variable if a probability matrix is requested\n")
    }
    
    # Create a base probMatrix file that is NULL
    probData <- NULL

    # Create the predData tibble (and probData matrix if probMatrix=TRUE)
    if (probMatrix) {
        # If logistic, the probability is the prediction and a tentative class can be set using 50%
        if (objective=="binary:logistic") {
            probData <- matrix(data=c(1-predict(mdl, newdata=dfSparse), predict(mdl, newdata=dfSparse)), 
                               nrow=nrow(dfSparse), 
                               ncol=length(yLevels), 
                               byrow=FALSE
                               )
        } else {
            probData <- matrix(data=predict(mdl, newdata=dfSparse), 
                               nrow=nrow(dfSparse), 
                               ncol=length(yLevels), 
                               byrow=TRUE
                               )
        }
        maxCol <- apply(probData, 1, FUN=which.max)
        predData <- tibble::tibble(predicted=factor(yLevels[maxCol], levels=yLevels), 
                                   probPredicted=apply(probData, 1, FUN=max)
                                   )
        probData <- probData %>%
            as_tibble() %>%
            purrr::set_names(yLevels)
    } else {
        predData <- tibble::tibble(predicted=predict(mdl, newdata=dfSparse))
    }
    
    # Return a list of predData and probData
    list(predData=predData, probData=probData)

}
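
The byrow choices mirror how predict() lays out its results: multi:softprob returns a single flat vector ordered class-by-class within each observation, so byrow=TRUE restores the observation-by-class matrix. A toy illustration with made-up probabilities:

```r
# Flat softprob-style vector for 2 observations over 3 classes
flatPreds <- c(0.7, 0.2, 0.1,   # observation 1
               0.1, 0.1, 0.8)   # observation 2
probData <- matrix(flatPreds, nrow=2, ncol=3, byrow=TRUE)
probData
#      [,1] [,2] [,3]
# [1,]  0.7  0.2  0.1
# [2,]  0.1  0.1  0.8
apply(probData, 1, FUN=which.max)   # most probable class per observation
# [1] 1 3
```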

The function xgbRunModel() is modified to call the helper functions just created:

# Run xgb model with desired parameters
xgbRunModel_002 <- function(tbl, 
                            depVar, 
                            predVars,
                            otherVars=c("source", "dtime"),
                            critFilter=vector("list", 0),
                            critFilterNot=vector("list", 0),
                            dropEmptyLevels=TRUE,
                            seed=NULL, 
                            nrounds=200, 
                            print_every_n=nrounds, 
                            testSize=0.3, 
                            xgbObjective="reg:squarederror",
                            funcRun=xgboost::xgboost,
                            calcErr=TRUE,
                            ...
                            ) {
    
    # FUNCTION ARGUMENTS:
    # tbl: the data frame or tibble
    # depVar: the dependent variable that will be predicted
    # predVars: explanatory variables for modeling
    # otherVars: other variables to be kept in a final testData file, but not used in modeling
    # critFilter: named list of format list(varName=c(varValues))
    #             will include only observations where get(varName) %in% varValues
    #             vector("list", 0) creates a length-zero list, which never runs in the for loop
    # critFilterNot: named list of format list(varName=c(varValues))
    #                same as critFilter operation, but will apply !(varName %in% varValues)
    # dropEmptyLevels: boolean, whether to run fct_drop on all variables of class factor after critFilter
    # seed: the random seed (NULL means no seed)
    # nrounds: the maximum number of boosting iterations
    # print_every_n: how frequently to print the progress of training error/accuracy while fitting
    # testSize: the fractional portion of data that should be used as the test dataset
    # xgbObjective: the objective function for xgboost
    # funcRun: the function to run, passed as a function
    # calcErr: boolean, whether to create variable err as predicted-get(depVar)
    # ...: additional arguments to be passed directly to xgboost
    
    # Check that funcName is valid and get the relevant function
    valFuncs <- c("xgboost", "xgb.cv")
    funcName <- as.character(substitute(funcRun))
    if (!(funcName[length(funcName)] %in% valFuncs)) {
        cat("\nFunction is currently only prepared for:", valFuncs, "\n")
        stop("Please change passed argument or update function\n")
    }
    
    # Check that the objective function is programmed
    valObjectiveFull <- c("reg:squarederror", "binary:logistic", "multi:softprob")
    valObjectiveCVOnly <- c("multi:softmax")  # This is not implemented in helperXGBPredict, so it is CV only
    if (!(xgbObjective %in% valObjectiveFull) & funcName[length(funcName)]=="xgboost") {
        cat("\nFunction runs xgboost with predict only for:", valObjectiveFull, "\n")
        stop("Please change passed argument or update function (may also need to update helperXGBPredict())\n")
    }
    if (!(xgbObjective %in% c(valObjectiveFull, valObjectiveCVOnly))) {
        cat("\nFunction runs xgb.cv only for:", valObjectiveFull, valObjectiveCVOnly, "\n")
        stop("Please change passed argument or update function (may also need to update helperXGBPredict())\n")
    }
    
    # Filter such that only matches to critFilter are included
    tbl <- helperFilterData(tbl, 
                            keepVars=c(depVar, predVars, otherVars), 
                            noNA=c(depVar, predVars), 
                            critFilter=critFilter, 
                            critFilterNot=critFilterNot,
                            dropEmptyLevels=dropEmptyLevels
                            )
    
    # Create test-train split
    ttLists <- createTestTrain(tbl, testSize=testSize, seed=seed)
    
    # Set the seed if requested
    if (!is.null(seed)) { set.seed(seed) }
    
    # Recode the training data and testing data
    # depIsFactor=NULL means: treat (and convert) depVar as a factor unless it is numeric
    recodeTrain <- helperXGBRecode(ttLists$trainData, 
                                   depVar=depVar, 
                                   predVars=predVars, 
                                   depIsFactor=NULL
                                   )
    recodeTest <- helperXGBRecode(ttLists$testData, 
                                  depVar=depVar, 
                                  predVars=predVars, 
                                  depIsFactor=NULL
                                  )
        
    # Pull the dependent variable, dependent variable levels, and sparse matrix x for training
    yTrain <- recodeTrain$y
    yTrainLevels <- recodeTrain$yLevels
    sparseTrain <- recodeTrain$x
    
    # Train model
    xgbModel <- funcRun(data=sparseTrain, 
                        label=yTrain, 
                        nrounds=nrounds, 
                        print_every_n=print_every_n, 
                        objective=xgbObjective, 
                        ...
                        )

    # Make predictions, including getting the probability matrix if yTrainLevels exists (yTrain is factor)
    # Run only if xgboost was passed (no predictions for CV)
    if (funcName[length(funcName)] %in% c("xgboost")) {
        # Make the predictions
        xgbPredsList <- helperXGBPredict(xgbModel, 
                                         dfSparse=recodeTest$x, 
                                         probMatrix=!is.null(yTrainLevels), 
                                         yLevels=yTrainLevels, 
                                         objective=xgbObjective
                                         )
    
        # Combine the xgbPredList output with existing testData file
        testData <- ttLists$testData %>%
            bind_cols(xgbPredsList$predData)
        
        # Extract probData
        probData <- xgbPredsList$probData
        
    } else {
        # Create NULL for testData and probData
        testData <- NULL
        probData <- NULL
    }
    
    # Return list containing funcName, trained model, testData, probability matrix, and training levels
    list(funcName=funcName[length(funcName)], 
         xgbModel=xgbModel, 
         testData=testData, 
         predData=probData,
         yTrainLevels=yTrainLevels
         )
    
}
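
The funcName check relies on lazy evaluation: substitute(funcRun) captures the unevaluated expression, and a namespaced name like pkg::fun deparses to three elements, which is why the last element is compared against valFuncs. A minimal demonstration (using stats::median as a stand-in):

```r
# substitute() captures the expression passed for an argument without evaluating it
showFuncName <- function(funcRun) as.character(substitute(funcRun))

showFuncName(stats::median)            # returns c("::", "stats", "median")
tail(showFuncName(stats::median), 1)   # returns "median"
```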

The function is then tested on a numeric response variable:

# Define key predictor variables for base XGB runs
baseXGBPreds <- c("locNamefct", "month", "hrfct", 
                  "DewF", "modSLP", "Altimeter", "WindSpeed", 
                  "predomDir", "minHeight", "ceilingHeight"
                  )

# Core multi-year cities
multiYearLocales <- c("Las Vegas, NV", "New Orleans, LA", "Chicago, IL", "San Diego, CA")

# Run the function shell
xgbInit_002 <- xgbRunModel_002(metarData, 
                               depVar="TempF", 
                               predVars=baseXGBPreds, 
                               otherVars=c("source", "dtime"), 
                               critFilter=list(locNamefct=multiYearLocales),
                               seed=2008011825,
                               nrounds=2000,
                               print_every_n=50
                               )
## [1]  train-rmse:47.096767 
## [51] train-rmse:4.079394 
## [101]    train-rmse:3.584855 
## [151]    train-rmse:3.299586 
## [201]    train-rmse:3.106099 
## [251]    train-rmse:2.970594 
## [301]    train-rmse:2.866863 
## [351]    train-rmse:2.776774 
## [401]    train-rmse:2.697801 
## [451]    train-rmse:2.626064 
## [501]    train-rmse:2.572620 
## [551]    train-rmse:2.518776 
## [601]    train-rmse:2.469624 
## [651]    train-rmse:2.426183 
## [701]    train-rmse:2.390440 
## [751]    train-rmse:2.350134 
## [801]    train-rmse:2.315014 
## [851]    train-rmse:2.276004 
## [901]    train-rmse:2.244308 
## [951]    train-rmse:2.213070 
## [1001]   train-rmse:2.184253 
## [1051]   train-rmse:2.156494 
## [1101]   train-rmse:2.133281 
## [1151]   train-rmse:2.104960 
## [1201]   train-rmse:2.079930 
## [1251]   train-rmse:2.053286 
## [1301]   train-rmse:2.030751 
## [1351]   train-rmse:2.010247 
## [1401]   train-rmse:1.989779 
## [1451]   train-rmse:1.966085 
## [1501]   train-rmse:1.944863 
## [1551]   train-rmse:1.920303 
## [1601]   train-rmse:1.897431 
## [1651]   train-rmse:1.877888 
## [1701]   train-rmse:1.860289 
## [1751]   train-rmse:1.836947 
## [1801]   train-rmse:1.817471 
## [1851]   train-rmse:1.800320 
## [1901]   train-rmse:1.783760 
## [1951]   train-rmse:1.767167 
## [2000]   train-rmse:1.754318

The function is then tested on a single-class factor variable with the CV capability:

# Run the function shell
xgb_las2016_cv_002 <- xgbRunModel_002(las2016Data, 
                                      depVar="isLAS", 
                                      predVars=locXGBPreds, 
                                      otherVars=keepVarFull, 
                                      seed=2008051405,
                                      nrounds=1000,
                                      print_every_n=50, 
                                      xgbObjective="binary:logistic", 
                                      funcRun=xgboost::xgb.cv, 
                                      nfold=5
                                      )
## [1]  train-error:0.091361+0.003256   test-error:0.108349+0.004028 
## [51] train-error:0.010876+0.000528   test-error:0.040477+0.004260 
## [101]    train-error:0.002167+0.000357   test-error:0.034017+0.003774 
## [151]    train-error:0.000327+0.000119   test-error:0.031401+0.002359 
## [201]    train-error:0.000000+0.000000   test-error:0.030419+0.002076 
## [251]    train-error:0.000000+0.000000   test-error:0.029275+0.002566 
## [301]    train-error:0.000000+0.000000   test-error:0.028947+0.002057 
## [351]    train-error:0.000000+0.000000   test-error:0.028539+0.002153 
## [401]    train-error:0.000000+0.000000   test-error:0.028375+0.002605 
## [451]    train-error:0.000000+0.000000   test-error:0.027639+0.002433 
## [501]    train-error:0.000000+0.000000   test-error:0.027884+0.002740 
## [551]    train-error:0.000000+0.000000   test-error:0.028211+0.002492 
## [601]    train-error:0.000000+0.000000   test-error:0.028211+0.002783 
## [651]    train-error:0.000000+0.000000   test-error:0.028130+0.003084 
## [701]    train-error:0.000000+0.000000   test-error:0.027475+0.002847 
## [751]    train-error:0.000000+0.000000   test-error:0.027802+0.002877 
## [801]    train-error:0.000000+0.000000   test-error:0.028048+0.003315 
## [851]    train-error:0.000000+0.000000   test-error:0.028048+0.003193 
## [901]    train-error:0.000000+0.000000   test-error:0.027884+0.003138 
## [951]    train-error:0.000000+0.000000   test-error:0.027639+0.002643 
## [1000]   train-error:0.000000+0.000000   test-error:0.027394+0.002177

Test error is a bit higher than in the previous run. The discrepancy should be explored further.

The function is then tested on multi-class classification:

fourLocales <- c("Chicago, IL", "Las Vegas, NV", "New Orleans, LA", "San Diego, CA")

xgbFourLocales_002 <- xgbRunModel_002(filter(metarData, !is.na(TempF)), 
                                      depVar="locNamefct", 
                                      predVars=locXGBPreds, 
                                      otherVars=keepVarFull, 
                                      critFilter=list(year=2016, locNamefct=fourLocales), 
                                      seed=2008081315,
                                      nrounds=1000,
                                      print_every_n=50,
                                      xgbObjective="multi:softprob", 
                                      funcRun=xgboost::xgboost,
                                      calcErr=FALSE,
                                      num_class=length(fourLocales)
                                      )
## [1]  train-merror:0.180081 
## [51] train-merror:0.021112 
## [101]    train-merror:0.004614 
## [151]    train-merror:0.000449 
## [201]    train-merror:0.000041 
## [251]    train-merror:0.000000 
## [301]    train-merror:0.000000 
## [351]    train-merror:0.000000 
## [401]    train-merror:0.000000 
## [451]    train-merror:0.000000 
## [501]    train-merror:0.000000 
## [551]    train-merror:0.000000 
## [601]    train-merror:0.000000 
## [651]    train-merror:0.000000 
## [701]    train-merror:0.000000 
## [751]    train-merror:0.000000 
## [801]    train-merror:0.000000 
## [851]    train-merror:0.000000 
## [901]    train-merror:0.000000 
## [951]    train-merror:0.000000 
## [1000]   train-merror:0.000000
## Warning: `as_tibble.matrix()` requires a matrix with column names or a `.name_repair` argument. Using compatibility `.name_repair`.
## This warning is displayed once per session.
# Overall success
xgbFourLocales_002$testData %>%
    summarize(mean(locNamefct==predicted))
## # A tibble: 1 x 1
##   `mean(locNamefct == predicted)`
##                             <dbl>
## 1                           0.984

Overall accuracy is the same as when run using the xgbRunModel function.

Next steps are to check whether the logistic regression approach is being properly implemented in the updated function:

select(xgb_las2016_cv$xgbModel$evaluation_log, iter, test_err_orig=test_error_mean) %>%
    inner_join(select(xgb_las2016_cv_002$xgbModel$evaluation_log, iter, test_err_new=test_error_mean)) %>%
    pivot_longer(-iter, names_to="model", values_to="err") %>%
    filter(err <= 0.06) %>%
    ggplot(aes(x=iter, y=err, group=model, color=model)) + 
    geom_line() + 
    ylim(c(0, NA)) + 
    labs(x="# Iterations", 
         y="Classification Error", 
         title="CV Test Error by Function", 
         subtitle="New Function vs. Original Function"
         )
## Joining, by = "iter"

Test error is slightly higher when the data were passed to the function as 0/1 rather than having the function convert a factor. Is the same true when running xgboost::xgboost?

Exploration shows there was a logic error in helperXGBPredict that output nonsensical probabilities. Since binary:logistic returns only the probability of the positive class, a yes/no probability matrix cannot be formed by reshaping that vector alone; it must be paired with its complement.
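
Concretely, with made-up probabilities, the difference between recycling the prediction vector row-wise and pairing it with its complement is:

```r
# binary:logistic predictions: only P(class 1) for each observation
p <- c(0.9, 0.2, 0.6)

# Incorrect: recycling p row-wise scrambles values across observations
matrix(p, nrow=3, ncol=2, byrow=TRUE)
#      [,1] [,2]
# [1,]  0.9  0.2
# [2,]  0.6  0.9
# [3,]  0.2  0.6

# Correct: column 1 is P(class 0) = 1-p, column 2 is P(class 1) = p
matrix(c(1 - p, p), nrow=3, ncol=2, byrow=FALSE)
#      [,1] [,2]
# [1,]  0.1  0.9
# [2,]  0.8  0.2
# [3,]  0.4  0.6
```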

Function xgbRunModel() is updated to check that only a specified group of objective functions are allowed. Adding an objective function may require updating the associated helper functions. Currently allowed objective functions include:

  • reg:squarederror (default for numeric regression)
  • binary:logistic (default for 1/0 classification)
  • multi:softprob (default for multi-class classification)

Function helperXGBPredict is updated to account for the type of objective function, with binary:logistic separated so that the probability matrix is 1-predicted (probability of 0) and predicted (probability of 1). The modeling process can then be run on the Las Vegas data:

# Run the function shell
xgb_las2016_002 <- xgbRunModel_002(las2016Data, 
                                   depVar="isLAS", 
                                   predVars=locXGBPreds, 
                                   otherVars=keepVarFull, 
                                   seed=2008051405,
                                   nrounds=500,
                                   print_every_n=50, 
                                   xgbObjective="binary:logistic", 
                                   funcRun=xgboost::xgboost
                                   )
## [1]  train-error:0.096001 
## [51] train-error:0.012429 
## [101]    train-error:0.002535 
## [151]    train-error:0.000409 
## [201]    train-error:0.000000 
## [251]    train-error:0.000000 
## [301]    train-error:0.000000 
## [351]    train-error:0.000000 
## [401]    train-error:0.000000 
## [451]    train-error:0.000000 
## [500]    train-error:0.000000
## Warning: `as_tibble.matrix()` requires a matrix with column names or a `.name_repair` argument. Using compatibility `.name_repair`.
## This warning is displayed once per session.

The confusion matrices for the original and new approaches can then be compared:

# Original model
xgb_las2016$testData %>%
    count(binLAS, round(predicted))
## # A tibble: 4 x 3
##   binLAS `round(predicted)`     n
##    <dbl>              <dbl> <int>
## 1      0                  0  2587
## 2      0                  1    98
## 3      1                  0    37
## 4      1                  1  2519
# Updated model
xgb_las2016_002$testData %>%
    count(isLAS, predicted)
## # A tibble: 4 x 3
##   isLAS     predicted     n
##   <fct>     <fct>     <int>
## 1 Las Vegas Las Vegas  2519
## 2 Las Vegas All Other    37
## 3 All Other Las Vegas    98
## 4 All Other All Other  2587

Predictions for the model run with the same seed are identical. The ability to pass factor variables to binary:logistic is useful, as it returns output with the predicted class and the probability of that class (much as multi:softprob does).

Suppose that instead the Las Vegas model were run directly using softprob:

# Run the function shell
xgb_las2016_002_softprob <- xgbRunModel_002(las2016Data, 
                                            depVar="isLAS", 
                                            predVars=locXGBPreds, 
                                            otherVars=keepVarFull, 
                                            seed=2008051405,
                                            nrounds=500,
                                            print_every_n=50, 
                                            xgbObjective="multi:softprob", 
                                            funcRun=xgboost::xgboost, 
                                            calcErr=FALSE,
                                            num_class=2
                                            )
## [1]  train-merror:0.095592 
## [51] train-merror:0.009567 
## [101]    train-merror:0.001554 
## [151]    train-merror:0.000082 
## [201]    train-merror:0.000000 
## [251]    train-merror:0.000000 
## [301]    train-merror:0.000000 
## [351]    train-merror:0.000000 
## [401]    train-merror:0.000000 
## [451]    train-merror:0.000000 
## [500]    train-merror:0.000000

The confusion matrix can again be checked:

# Updated model
xgb_las2016_002_softprob$testData %>%
    count(isLAS, predicted)
## # A tibble: 4 x 3
##   isLAS     predicted     n
##   <fct>     <fct>     <int>
## 1 Las Vegas Las Vegas  2519
## 2 Las Vegas All Other    37
## 3 All Other Las Vegas    89
## 4 All Other All Other  2596

Classification success is nearly identical, with just some small rounding differences.

A larger classification is run using the four key locales plus a selection of locales that seem to map to more than one of the four key locales:

  • Chicago
  • Las Vegas
  • New Orleans
  • San Diego
  • Seattle
  • Atlanta
  • San Francisco
  • Denver
  • San Antonio
  • Tampa Bay

keyLocales <- c("Chicago, IL", "Las Vegas, NV", "New Orleans, LA", "San Diego, CA", "Seattle, WA", 
                "Atlanta, GA", "San Francisco, CA", "Denver, CO", "San Antonio, TX", "Tampa Bay, FL"
                )

xgbKeyLocales_002 <- xgbRunModel_002(metarData, 
                                     depVar="locNamefct", 
                                     predVars=locXGBPreds, 
                                     otherVars=keepVarFull, 
                                     critFilter=list(year=2016, locNamefct=keyLocales), 
                                     seed=2008111413,
                                     nrounds=1000,
                                     print_every_n=50,
                                     xgbObjective="multi:softprob", 
                                     funcRun=xgboost::xgboost,
                                     calcErr=FALSE,
                                     num_class=length(keyLocales)
                                     )
## [1]  train-merror:0.452846 
## [51] train-merror:0.113220 
## [101]    train-merror:0.045749 
## [151]    train-merror:0.021967 
## [201]    train-merror:0.011278 
## [251]    train-merror:0.005819 
## [301]    train-merror:0.002697 
## [351]    train-merror:0.001455 
## [401]    train-merror:0.000556 
## [451]    train-merror:0.000245 
## [501]    train-merror:0.000082 
## [551]    train-merror:0.000049 
## [601]    train-merror:0.000016 
## [651]    train-merror:0.000016 
## [701]    train-merror:0.000000 
## [751]    train-merror:0.000000 
## [801]    train-merror:0.000000 
## [851]    train-merror:0.000000 
## [901]    train-merror:0.000000 
## [951]    train-merror:0.000000 
## [1000]   train-merror:0.000000

Model performance can be evaluated:

# Overall success
xgbKeyLocales_002$testData %>%
    summarize(mean(locNamefct==predicted))
## # A tibble: 1 x 1
##   `mean(locNamefct == predicted)`
##                             <dbl>
## 1                           0.940
# Histogram of predicted probabilities for selected class
xgbKeyLocales_002$testData %>%
    mutate(correct=locNamefct==predicted) %>%
    ggplot() + 
    stat_count(aes(x=round(probPredicted, 2), y=..prop.., group=correct)) + 
    facet_wrap(~correct) + 
    labs(x="Probability Given to Prediction", 
         y="Proportion of Predictions", 
         title="Probability of Prediction vs. Accuracy of Prediction"
         )

# Confusion matrix
xgbKeyLocales_002$testData %>%
    count(locNamefct, predicted) %>%
    group_by(locNamefct) %>%
    mutate(pct=n/sum(n)) %>%
    ggplot(aes(x=stringr::str_replace(predicted, pattern=", ", replacement="\n"), y=locNamefct)) + 
    geom_tile(aes(fill=pct)) + 
    geom_text(aes(label=paste0(round(100*pct), "%"))) + 
    scale_fill_continuous("", low="white", high="green") + 
    labs(title="Predicted vs. Actual Locale Frequency", y="Actual Locale", x="Predicted Locale")

# Find and plot importances
xgbKeyLocales_importance_002 <- plotXGBImportance(xgbKeyLocales_002, 
                                                  featureStems=locXGBPreds, 
                                                  stemMapper = varMapper, 
                                                  plotTitle="Gain by variable in xgboost", 
                                                  plotSubtitle="2016 Locale (Subset of 10 locales modeled)"
                                                  )

The numeric variables (dew point, SLP, altimeter, temperature) dominate the variable importance (gain) in the XGB model, and the combination drives roughly 95% accuracy in predictions. There are a few clusters that drive misclassifications of over 1% - Chicago/Atlanta, New Orleans/Tampa, San Diego/San Francisco.

Predictions can then be extended to the non-modeled locales:

# Extract data for the non-modeled locales
otherLocalesData_002 <- metarData %>%
    filter(!is.na(TempF), year == 2016, !(locNamefct %in% keyLocales))

# Make the predictions
otherLocalesPred_002 <- otherLocalesData_002 %>%
    mutate_if(is.factor, .funs=fct_drop) %>%
    helperMakeSparse(predVars=locXGBPreds) %>%
    predict(xgbKeyLocales_002$xgbModel, newdata=.)

# Create the prediction matrix
otherLocalesMatrix_002 <- matrix(data=otherLocalesPred_002, 
                                 ncol=length(keyLocales), 
                                 nrow=nrow(otherLocalesData_002), 
                                 byrow=TRUE
                                 )

# Get the predictions and probabilities, and add them to non2016FourLocalesData
maxCol <- apply(otherLocalesMatrix_002, 1, FUN=which.max)
otherLocalesData_002 <- otherLocalesData_002 %>%
    mutate(predicted=xgbKeyLocales_002$yTrainLevels[maxCol], 
           probPredicted=apply(otherLocalesMatrix_002, 1, FUN=max), 
           correct=locNamefct==predicted
           )
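The reshape above relies on xgboost returning multi:softprob probabilities as one flat vector, ordered observation by observation (all class probabilities for row 1, then row 2, and so on), which is why byrow=TRUE is required. A minimal base-R sketch of the same reshape-and-argmax logic, using made-up probabilities and hypothetical locale names:

```r
# Flat softprob-style vector: 2 observations x 3 classes, in row-major order
probs <- c(0.1, 0.7, 0.2,   # observation 1
           0.6, 0.3, 0.1)   # observation 2
yLevels <- c("Chicago, IL", "Las Vegas, NV", "San Diego, CA")  # hypothetical levels

# Reshape so each row holds one observation's class probabilities
probMat <- matrix(probs, ncol=length(yLevels), byrow=TRUE)

# Most likely class and its probability for each observation
maxCol <- apply(probMat, 1, FUN=which.max)
predicted <- yLevels[maxCol]
probPredicted <- apply(probMat, 1, FUN=max)
# predicted is c("Las Vegas, NV", "Chicago, IL"); probPredicted is c(0.7, 0.6)
```

With byrow=FALSE the probabilities would be scrambled across observations, so the argmax would be meaningless.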

# Confusion matrix
otherLocalesData_002 %>%
    count(locNamefct, predicted) %>%
    group_by(locNamefct) %>%
    mutate(pct=n/sum(n), pctChi=ifelse(predicted=="Chicago, IL", pct, 0)) %>%
    ggplot(aes(x=stringr::str_replace(predicted, pattern=", ", replacement="\n"), 
               y=fct_reorder(locNamefct, pctChi, .fun=max)
               )
           ) + 
    geom_tile(aes(fill=pct)) + 
    geom_text(aes(label=paste0(round(100*pct), "%"))) + 
    scale_fill_continuous("", low="white", high="green") + 
    labs(title="Predicted vs. Actual Locale Frequency", y="Actual Locale", x="Predicted Locale")

As the number of categories increases, so does the likelihood that a locale is assigned meaningful probability in two or more categories. Some of the larger splits seem sensible: Saint Louis/Indianapolis/Lincoln as Atlanta/Chicago, Dallas as Atlanta/Chicago/San Antonio, Newark/Philadelphia/DC as Chicago/Atlanta/Denver/Tampa, San Jose/Los Angeles as San Diego/San Francisco, and Houston as Tampa/New Orleans.

Next steps are to consolidate the evaluation functions to assess model quality for both regression and classification exercises.

The evaluation process includes:

  • Assessment of variable importance from modeling - plotXGBImportance()
  • Assessment of the train (and sometimes test) error during modeling as shown in evaluation_log - plotXGBTrainEvolution() with modifications
  • Assessment of model predictions on hold-out data from the test data that was split from the training data - plotXGBTestData() should be updated for linear and confusion matrix
  • Assessment of prediction confidence vs. prediction accuracy on hold-out data
  • Assessment of model predictions on data that was not originally part of the train/test methodology

Function plotXGBImportance() is already programmed. It produces both a full feature importance (one row per factor contrast) and a summarized feature importance aggregated to the variable stems associated with each factor:

# Find and plot importances
xgbKeyLocales_importance_002 <- plotXGBImportance(xgbKeyLocales_002, 
                                                  featureStems=locXGBPreds, 
                                                  stemMapper = varMapper, 
                                                  plotTitle="Gain by variable in xgboost", 
                                                  plotSubtitle="2016 Locale (Subset of 10 locales modeled)"
                                                  )

# Example of data available in the output list
str(xgbKeyLocales_importance_002)
## List of 2
##  $ importanceData:Classes 'data.table' and 'data.frame': 60 obs. of  4 variables:
##   ..$ Feature  : chr [1:60] "DewF" "modSLP" "Altimeter" "TempF" ...
##   ..$ Gain     : num [1:60] 0.2201 0.1677 0.144 0.1408 0.0469 ...
##   ..$ Cover    : num [1:60] 0.115 0.1808 0.1525 0.1108 0.0356 ...
##   ..$ Frequency: num [1:60] 0.146 0.194 0.139 0.137 0.098 ...
##   ..- attr(*, ".internal.selfref")=<externalptr> 
##  $ stemData      :Classes 'tbl_df', 'tbl' and 'data.frame':  10 obs. of  4 variables:
##   ..$ Feature  : chr [1:10] "month" "hrfct" "TempF" "DewF" ...
##   ..$ Gain     : num [1:10] 0.1092 0.0142 0.1408 0.2201 0.1677 ...
##   ..$ Cover    : num [1:10] 0.167 0.112 0.111 0.115 0.181 ...
##   ..$ Frequency: num [1:10] 0.0902 0.0602 0.1373 0.1457 0.1935 ...

Next, the function plotXGBTrainEvolution() is updated to plotXGBEvolution(), which now handles both 1) regression vs. classification and 2) train-only vs. train-and-test runs:

# Function to create and plot key metrics for an XGB model
# Currently, the R-squared capability is disabled, and this will only plot RMSE/Error
plotXGBEvolution <- function(mdl, 
                             subList="xgbModel", 
                             isRegression=TRUE,
                             isClassification=!isRegression,
                             first_iter_plot=10,
                             label_every=10, 
                             show_line=FALSE,
                             show_dashed=isClassification,
                             rounding=1, 
                             yLim=NULL
                             ) {
    
    # FUNCTION ARGUMENTS:
    # mdl: the xgb.Booster model file, or a list containing the xgb.Booster model file
    # subList: if mdl is a list, attempt to pull out item named in subList
    # isRegression: boolean, should this have regression metrics (RMSE, R2) plotted?
    # isClassification: boolean, should this have classification metrics (accuracy) plotted?
    # first_iter_plot: the first iteration to plot (for avoiding the very early burn-in errors)
    # label_every: how often to add text for the error data (e.g., 10 means print iter 10, 20, 30, etc.)
    # show_line: boolean, whether to include a line in the plot, or just the labels
    # show_dashed: boolean, whether to draw dashed horizontal lines at each group's minimum error
    # rounding: level of precision for rounding when text is displayed on the plot
    # yLim: user-specified y-limits

    # Check that exactly one of isRegression and isClassification has been chosen
    if (!xor(isRegression, isClassification)) {
        cat("\nParameters passed for isRegression:", isRegression, " and isClassification:", isClassification)
        stop("\nThese parameters must be logical and pass an exclusive or gate; please retry\n")
    }
    
    # Pull out the modeling data from the list if needed
    if (!("xgb.Booster" %in% class(mdl))) {
        mdl <- mdl[[subList]]
    }
    
    # Pull the error data
    errData <- mdl$evaluation_log
    
    # Convert names so that _merror becomes _error
    errNames <- names(errData)
    errNames <- str_replace(errNames, pattern="_merror", replacement="_error")
    names(errData) <- errNames
    
    # Keep only iter and then items that end in _rmse, _error, or _mean
    keepVars <- errNames[grepl(x=errNames, pattern="(iter|_rmse|_error|_mean)$")]
    errUse <- select_at(errData, vars(all_of(keepVars)))
    
    # Convert names so that train.* becomes train and test.* becomes test
    errUseNames <- names(errUse)
    errUseNames <- str_replace(errUseNames, pattern="train.*", replacement="train")
    errUseNames <- str_replace(errUseNames, pattern="test.*", replacement="test")
    names(errUse) <- errUseNames
    
    # Pivot the data longer, making the new descriptive column 'type' and the new numeric column 'error'
    errUse <- errUse %>%
        pivot_longer(-iter, names_to="type", values_to="error")
    
    # Helper function to create requested plot(s)
    helperPlotEvolution <- function(df, yVar, rnd, desc, size=3, label_every=1) {
        # Horizontal lines to be drawn at minimum by group
        groupMin <- df %>% group_by(type) %>% summarize(minError=min(error)) %>% pull(minError)
        # Create the base plot
        p1 <- df %>%
            filter(iter >= first_iter_plot) %>%
            ggplot(aes_string(x="iter", y=yVar, group="type", color="type")) + 
            labs(x="Number of iterations", 
                 y=paste0(desc), 
                 title=paste0("Evolution of ", desc)
                 ) + 
            scale_color_discrete("Data set") + 
            geom_vline(aes(xintercept=0), lty=2)
        if (show_dashed) p1 <- p1 + geom_hline(yintercept=groupMin, lty=2)
        if (!is.null(label_every)) {
            p1 <- p1 + geom_text(data=~filter(., (iter %% label_every)==0), 
                                 aes(label=round(get(yVar), rnd)), 
                                 size=size
                                 )
        }
        if (show_line) p1 <- p1 + geom_line()
        if (!is.null(yLim)) p1 <- p1 + ylim(yLim)
        print(p1)
    }
    
    # Create plot for regression (RMSE) or classification (error)
    helperPlotEvolution(errUse, 
                        yVar="error", 
                        rnd=rounding, 
                        desc=if(isRegression) "RMSE" else "Error", 
                        label_every=label_every
                        )

    # Return the relevant data file
    errUse
    
}
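The keepVars filter inside the function is meant to retain iter plus any column ending in _rmse, _error, or _mean; anchoring an alternation group at the end of the name accomplishes this (a bracketed character class would instead match single trailing characters and keep far too much). A standalone check on illustrative evaluation_log column names:

```r
# Illustrative evaluation_log column names
errNames <- c("iter", "train_rmse", "test_rmse_mean", "train_error", "ntreelimit")

# Keep 'iter' and anything ending in _rmse, _error, or _mean
keep <- errNames[grepl(x=errNames, pattern="(iter|_rmse|_error|_mean)$")]
# keep is c("iter", "train_rmse", "test_rmse_mean", "train_error")
```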

# Test for regression data
plotXGBEvolution(xgbInit_002, isRegression=TRUE, label_every=50, yLim=c(0, NA))

## # A tibble: 2,000 x 3
##     iter type  error
##    <dbl> <chr> <dbl>
##  1     1 train 47.1 
##  2     2 train 33.5 
##  3     3 train 24.0 
##  4     4 train 17.6 
##  5     5 train 13.2 
##  6     6 train 10.3 
##  7     7 train  8.52
##  8     8 train  7.37
##  9     9 train  6.67
## 10    10 train  6.21
## # ... with 1,990 more rows
# Test for classification data - single-class
plotXGBEvolution(xgb_las2016, isRegression=FALSE, label_every=NULL, yLim=c(0, NA), show_line=TRUE)

## # A tibble: 500 x 3
##     iter type   error
##    <dbl> <chr>  <dbl>
##  1     1 train 0.0960
##  2     2 train 0.0789
##  3     3 train 0.0681
##  4     4 train 0.0632
##  5     5 train 0.0583
##  6     6 train 0.0554
##  7     7 train 0.0514
##  8     8 train 0.0484
##  9     9 train 0.0464
## 10    10 train 0.0437
## # ... with 490 more rows
# Test for classification data - single-class with CV
plotXGBEvolution(xgb_las2016_cv, isRegression=FALSE, label_every=NULL, yLim=c(0, NA), show_line=TRUE)

## # A tibble: 2,000 x 3
##     iter type   error
##    <dbl> <chr>  <dbl>
##  1     1 train 0.0895
##  2     1 test  0.101 
##  3     2 train 0.0740
##  4     2 test  0.0852
##  5     3 train 0.0684
##  6     3 test  0.0811
##  7     4 train 0.0605
##  8     4 test  0.0748
##  9     5 train 0.0579
## 10     5 test  0.0704
## # ... with 1,990 more rows
# Test for classification data - multi-class with CV
plotXGBEvolution(xgbFourLocalesCV_20152018, 
                 isRegression=FALSE, 
                 label_every=NULL, 
                 first_iter_plot = 50,
                 yLim=c(0, NA), 
                 show_line=TRUE
                 )

## # A tibble: 2,000 x 3
##     iter type  error
##    <dbl> <chr> <dbl>
##  1     1 train 0.213
##  2     1 test  0.217
##  3     2 train 0.195
##  4     2 test  0.201
##  5     3 train 0.184
##  6     3 test  0.190
##  7     4 train 0.173
##  8     4 test  0.180
##  9     5 train 0.165
## 10     5 test  0.171
## # ... with 1,990 more rows

Next steps are to continue with evaluating model predictions on test data (RMSE or confusion matrix, confidence level) and to extend to predictions on other data sets. Function plotXGBTestData() is updated (and renamed to assessTestData()) so that it can show either the linear trend of the predicted-actual relationship or the confusion matrix:

# Helper function to calculate and display RMSE/R2 for continuous data
helperCalculateRMSER2 <- function(df, depVar, predVar) {
    
    df %>%
        summarize(rmse_overall=sd(get(depVar)), 
                  rmse_model=mean((get(predVar)-get(depVar))**2)**0.5
                  ) %>%
        mutate(rsq=1-rmse_model**2/rmse_overall**2)

}
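The R-squared in the helper comes directly from the two RMSE values: the overall standard deviation plays the role of a baseline error (roughly the RMSE of predicting the mean), so rsq = 1 - rmse_model^2 / rmse_overall^2. A base-R sketch of the same arithmetic on toy numbers (note that sd() uses the n-1 denominator, so this is not exactly 1 - SSE/SST):

```r
actual    <- c(50, 60, 70, 80)
predicted <- c(52, 58, 71, 79)

rmse_overall <- sd(actual)                           # baseline spread of the actuals
rmse_model   <- sqrt(mean((predicted - actual)^2))   # model RMSE
rsq          <- 1 - rmse_model^2 / rmse_overall^2
```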


# Helper function to plot predicted vs. actual for continuous data
helperPlotTestContinuous <- function(df, predVar, depVar, roundSig=0) {
    
    # Show overall model performance using rounded temperature and predictions
    p1 <- df %>%
        mutate(rndPred=round(get(predVar), roundSig), rndDep=round(get(depVar), roundSig)) %>%
        group_by_at(vars(all_of(c("rndDep", "rndPred")))) %>%
        summarize(n=n()) %>%
        ggplot(aes_string(x="rndDep", y="rndPred")) + 
        geom_point(aes(size=n), alpha=0.1) + 
        geom_smooth(aes(weight=n)) + 
        geom_abline(lty=2, color="red") + 
        labs(title="Predictions vs. actual on test dataset", y="Predicted", x="Actual")
    print(p1)

}


# Helper function to create a confusion matrix
helperConfusion <- function(df, actual, predVar="predicted", ySortVar=NULL, rotateOn=11) {

    # Number of categories that will be on the x-axis
    nX <- df %>% 
        pull(predVar) %>% 
        as.character() %>% 
        unique() %>% 
        length()
    
    # Confusion matrix
    p1 <- df %>%
        group_by_at(vars(all_of(c(actual, predVar)))) %>%
        summarize(n=n()) %>%
        group_by_at(vars(all_of(actual))) %>%
        mutate(pct=n/sum(n), 
               pctSort=if (is.null(ySortVar)) 0 else ifelse(get(predVar)==ySortVar, pct, 0)
               ) %>%
        ggplot(aes(x=if (nX < rotateOn) stringr::str_replace(get(predVar), pattern=", ", replacement="\n") else get(predVar), 
                   y=if (is.null(ySortVar)) get(actual) else fct_reorder(get(actual), pctSort, .fun=max)
                   )
               ) + 
        geom_tile(aes(fill=pct)) + 
        geom_text(aes(label=paste0(round(100*pct), "%"))) + 
        scale_fill_continuous("", low="white", high="green") + 
        labs(title="Predicted vs. Actual Frequency", y="Actual Locale", x="Predicted Locale")
    # If the rotateOn criterion is met, rotate the x-axis labels by 90 degrees and place the axis at the top
    if (nX >= rotateOn) {
        p1 <- p1 + 
            theme(axis.text.x=element_text(angle=90, hjust=0.1)) + # Rotate by 90 degrees
            scale_x_discrete(position = "top") # Put at the top of the graph
    }
    # Print the plot
    print(p1)
    
}


# Updated function to report on, and plot, prediction quality
assessTestData <- function(df, 
                           depVar,
                           predVar="predicted",
                           subList="testData", 
                           isRegression=FALSE, 
                           isClassification=FALSE,
                           reportOverall=TRUE,
                           reportBy=NULL, 
                           roundSig=0,
                           showPlot=TRUE, 
                           ySortVar=NULL
                           ) {
    
    # FUNCTION ARGUMENTS:
    # df: the test data file, or a list containing the test data file
    # depVar: the variable that was predicted
    # predVar: the variable containing the prediction for depVar
    # subList: if df is a list, attempt to pull out item named in subList
    # isRegression: boolean, whether to produce regression (RMSE/R2) assessments
    # isClassification: boolean, whether to produce classification (accuracy/confusion) assessments
    # reportOverall: boolean, whether to report an overall RMSE/R2 on test data
    # reportBy: variable for summarizing RMSE/R2 by (NULL means no RMSE/R2 by any grouping variables)
    # roundSig: rounding significance for numeric plotting (points at x/y will be sized by n after rounding)
    # showPlot: boolean, whether to create/show the plot of predictions vs actuals
    # ySortVar: character, if a confusion matrix is plotted, sort y high-low by percent in this category
    
    # Pull out the modeling data from the list if needed
    if ("list" %in% class(df)) {
        df <- df[[subList]]
    }

    # Report overall RMSE/R2 if regression model and requested
    if (isRegression & reportOverall) {
        cat("\nOVERALL PREDICTIVE PERFORMANCE:\n\n")
        helperCalculateRMSER2(df, depVar=depVar, predVar=predVar) %>%
            print()
        cat("\n")
    }    

    # Report by grouping variables if any provided and regression model
    if (!is.null(reportBy) & isRegression) {
        cat("\nPREDICTIVE PERFORMANCE BY GROUP(S):\n\n")
        sapply(reportBy, FUN=function(x) { 
            df %>% 
                group_by_at(x) %>% 
                helperCalculateRMSER2(depVar=depVar, predVar=predVar) %>%
                print()
            }
        )
        cat("\n")
    }

    # Report accuracy by grouping variable and overall if reportBy and classification model
    if (!is.null(reportBy) & isClassification) {
        dfPlot <- df %>%
            mutate(isCorrect=(get(depVar)==get(predVar)))
        overallAcc <- mean(dfPlot$isCorrect)
        p1 <- dfPlot %>%
            group_by_at(vars(all_of(reportBy))) %>%
            summarize(pctCorrect=mean(isCorrect)) %>%
            ggplot(aes(x=fct_reorder(get(reportBy), pctCorrect), y=pctCorrect)) + 
            geom_col(fill="lightblue") + 
            geom_text(aes(y=pctCorrect+0.02, label=paste0(round(100*pctCorrect), "%")), hjust=0) +
            coord_flip() + 
            geom_hline(aes(yintercept=overallAcc), lty=2) + 
            annotate("text", 
                     x=2, 
                     y=overallAcc+0.02, 
                     label=paste0("Overall Accuracy:\n", round(100*overallAcc, 1), "%"), 
                     hjust=0
                     ) +
            labs(title="Accuracy of Predictions by Actual", x="Actual", y="Percent Accurately Predicted") + 
            ylim(c(0, 1.05))
        print(p1)
    }
    
    # Plot numerical summary
    if (showPlot & isRegression) {
        helperPlotTestContinuous(df, predVar=predVar, depVar=depVar, roundSig=roundSig)
    }
    
    # Plot categorical summary
    if (showPlot & isClassification) {
        helperConfusion(df, actual=depVar, predVar=predVar, ySortVar=ySortVar)
    }
    
}


# Test for regression data
assessTestData(xgbInit_002, depVar="TempF", reportBy="locNamefct", isRegression=TRUE)
## 
## OVERALL PREDICTIVE PERFORMANCE:
## 
## # A tibble: 1 x 3
##   rmse_overall rmse_model   rsq
##          <dbl>      <dbl> <dbl>
## 1         18.0       2.98 0.972
## 
## 
## PREDICTIVE PERFORMANCE BY GROUP(S):
## 
## # A tibble: 4 x 4
##   locNamefct      rmse_overall rmse_model   rsq
##   <fct>                  <dbl>      <dbl> <dbl>
## 1 Chicago, IL            21.1        3.24 0.976
## 2 Las Vegas, NV          18.3        2.75 0.977
## 3 New Orleans, LA        13.3        3.03 0.948
## 4 San Diego, CA           7.37       2.89 0.846
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'

# Test for classification data - single-class
assessTestData(xgb_las2016_002, depVar="isLAS", isClassification=TRUE)

# Test for classification data - multi-class
assessTestData(xgbFourLocales_002, depVar="locNamefct", isClassification=TRUE)

Next, a function is written to assess prediction quality by confidence (applicable only for classification):

# Function to plot prediction accuracy vs. prediction confidence
plotPredictionConfidencevQuality <- function(df, 
                                             depVar,
                                             predVar="predicted", 
                                             probVar="probPredicted", 
                                             subList="testData", 
                                             roundProb=0.05, 
                                             dataLim=1
                                             ) {
    
    # FUNCTION ARGUMENTS:
    # df: the test data file, or a list containing the test data file
    # depVar: the variable that was predicted
    # predVar: the variable containing the prediction for depVar
    # probVar: the variable containing the probability associated to the prediction
    # subList: if df is a list, attempt to pull out item named in subList
    # roundProb: round predicted probabilities to the nearest this
    # dataLim: only report for points with at least this quantity of data
    
    # Pull out the modeling data from the list if needed
    if ("list" %in% class(df)) {
        df <- df[[subList]]
    }
    
    # Pull a main plotting database together
    plotData <- df %>%
        mutate(prob=round(get(probVar)/roundProb)*roundProb, 
               correct=get(depVar)==get(predVar)
               ) %>%
        count(prob, correct) %>%
        mutate(nCorrect=ifelse(correct, n, 0)) %>%
        group_by(prob) %>%
        summarize(n=sum(n), nCorrect=sum(nCorrect)) %>%
        mutate(pctCorrect=nCorrect/n) %>%
        ungroup()
    
    # Create overall plot of accuracy by predicted probability
    p1 <- plotData %>%
        filter(n >= dataLim) %>%
        ggplot(aes(x=prob, y=pctCorrect)) + 
        geom_point(aes(size=n)) + 
        geom_text(aes(y=pctCorrect-0.04, label=paste0(round(100*pctCorrect, 1), "%\n(n=", n, ")"))) + 
        geom_abline(lty=2) + 
        labs(x="Probability Predicted", y="Percent Correct", title="Accuracy vs. Probability Predicted")
    print(p1)
    
}
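The binning step rounds each predicted probability to the nearest multiple of roundProb before accuracy is aggregated. The rounding idiom in isolation, with the default roundProb of 0.05:

```r
roundProb <- 0.05
probs <- c(0.512, 0.874, 0.999, 0.326)

# Round each probability to the nearest multiple of roundProb
binned <- round(probs / roundProb) * roundProb
# binned is approximately c(0.50, 0.85, 1.00, 0.35)
```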

# Assess for classification data - single-class
plotPredictionConfidencevQuality(xgb_las2016_002, depVar="isLAS")

# Assess for classification data - multi-class
plotPredictionConfidencevQuality(xgbFourLocales_002, depVar="locNamefct", dataLim=5)

Functions can then be written to assess model quality on data that was not part of the original train/test split. This requires as inputs a trained model and a dataset that can be converted to the same variables the trained model used. The process is currently implemented only for classification:

# Function to apply predictions to other data
assessNonModelDataPredictions <- function(mdl, 
                                          df, 
                                          depVar,
                                          predVars, 
                                          yLevels, 
                                          ySortVar=NULL, 
                                          diagnose=FALSE
                                          ) {
    
    # FUNCTION ARGUMENTS:
    # mdl: the trained model
    # df: the data file for predictions
    # depVar: the actual category as contained in df
    # predVars: the predictor variables used to build the sparse model matrix
    # yLevels: the levels that the prediction can be made to (sort order is important)
    # ySortVar: character, if a confusion matrix is plotted, sort y high-low by percent in this category    
    #           NULL means no factor reordering, plot as-is
    # diagnose: boolean, whether to print mdl and dfSparse to help with debugging
    
    # Create the sparse data
    dfSparse <- df %>%
        mutate_if(is.factor, .funs=fct_drop) %>%
        helperMakeSparse(predVars=predVars)
    
    # Debugging, if needed
    if (diagnose) {
        print(mdl$feature_names)
        print(colnames(dfSparse))
    }
    
    # Create the predictions
    predsList <- helperXGBPredict(mdl=mdl, 
                                  dfSparse=dfSparse, 
                                  objective=mdl$params$objective, 
                                  probMatrix=TRUE, 
                                  yLevels=yLevels
                                  )
    
    # Add the predictions and associated probabilities to df
    df <- df %>%
        mutate(predicted=predsList$predData$predicted, 
               probPredicted=predsList$predData$probPredicted
               )
    
    # Plot the confusion matrix
    helperConfusion(df, actual=depVar, predVar="predicted", ySortVar=ySortVar)
    
    # Return the data frame
    df
    
}

# Assess for classification data - single-class
df_las2016_002 <- assessNonModelDataPredictions(mdl=xgb_las2016_002$xgbModel, 
                                                df=filter(metarData, !is.na(TempF), year!=2016), 
                                                depVar="locNamefct", 
                                                predVars=locXGBPreds, 
                                                yLevels=xgb_las2016_002$yTrainLevels, 
                                                ySortVar="Las Vegas"
                                                )
## Warning: `as_tibble.matrix()` requires a matrix with column names or a `.name_repair` argument. Using compatibility `.name_repair`.
## This warning is displayed once per session.

# Assess for classification data - multi-class
df_FourLocales_002 <- assessNonModelDataPredictions(mdl=xgbFourLocales_002$xgbModel, 
                                                    df=filter(metarData, !is.na(TempF), year==2016), 
                                                    depVar="locNamefct", 
                                                    predVars=locXGBPreds, 
                                                    yLevels=xgbFourLocales_002$yTrainLevels, 
                                                    ySortVar="Chicago, IL"
                                                    )

Next steps are to better organize the functions so the process can be run from start to finish for a specific data set and modeling technique.

Suppose for example that the goal is to predict the dew point based on the other information in a METAR file. The function xgbRunModel_002() will achieve this by integrating all of the prediction functions:

# Define key predictor variables for base XGB runs
dewXGBPreds <- c("locNamefct", "month", "hrfct", 
                 "TempF", "modSLP", "Altimeter", "WindSpeed", 
                 "predomDir", "minHeight", "ceilingHeight"
                 )

# Core multi-year cities
multiYearLocales <- c("Las Vegas, NV", "New Orleans, LA", "Chicago, IL", "San Diego, CA")

# Run the function shell
xgb_dewpoint_cv <- xgbRunModel_002(metarData, 
                                   depVar="DewF", 
                                   predVars=dewXGBPreds, 
                                   otherVars=keepVarFull, 
                                   critFilter=list(locNamefct=multiYearLocales),
                                   critFilterNot=list(year=2016),
                                   seed=2008141315,
                                   nrounds=500,
                                   print_every_n=50, 
                                   xgbObjective="reg:squarederror", 
                                   funcRun=xgboost::xgb.cv, 
                                   nfold=5
                                   )
## [1]  train-rmse:34.908990+0.021015   test-rmse:34.911011+0.093105 
## [51] train-rmse:5.748802+0.020406    test-rmse:6.015603+0.032112 
## [101]    train-rmse:5.405655+0.015986    test-rmse:5.855463+0.022648 
## [151]    train-rmse:5.166692+0.011762    test-rmse:5.787196+0.021932 
## [201]    train-rmse:4.975575+0.012507    test-rmse:5.750390+0.024581 
## [251]    train-rmse:4.815161+0.015122    test-rmse:5.730521+0.017267 
## [301]    train-rmse:4.681333+0.016265    test-rmse:5.720970+0.015350 
## [351]    train-rmse:4.550474+0.015954    test-rmse:5.712879+0.016396 
## [401]    train-rmse:4.433186+0.016231    test-rmse:5.711841+0.015496 
## [451]    train-rmse:4.325898+0.014494    test-rmse:5.710880+0.016200 
## [500]    train-rmse:4.227323+0.017815    test-rmse:5.710621+0.017307

An assessment of the train-test error can be performed:

plotXGBEvolution(xgb_dewpoint_cv, isRegression=TRUE, label_every=25, yLim=c(0, NA), show_line=FALSE)

## # A tibble: 1,000 x 3
##     iter type  error
##    <dbl> <chr> <dbl>
##  1     1 train  34.9
##  2     1 test   34.9
##  3     2 train  25.0
##  4     2 test   25.1
##  5     3 train  18.3
##  6     3 test   18.3
##  7     4 train  13.8
##  8     4 test   13.9
##  9     5 train  10.9
## 10     5 test   11.0
## # ... with 990 more rows

Test error is fairly stable after 200 rounds, at just under 6 degrees. The full model can then be run using 200 rounds:

# Run the function shell
xgb_dewpoint <- xgbRunModel_002(metarData, 
                                depVar="DewF", 
                                predVars=dewXGBPreds, 
                                otherVars=keepVarFull, 
                                critFilter=list(locNamefct=multiYearLocales),
                                critFilterNot=list(year=2016),
                                seed=2008141325,
                                nrounds=200,
                                print_every_n=50, 
                                xgbObjective="reg:squarederror", 
                                funcRun=xgboost::xgboost
                                )
## [1]  train-rmse:34.914940 
## [51] train-rmse:5.777020 
## [101]    train-rmse:5.456133 
## [151]    train-rmse:5.245913 
## [200]    train-rmse:5.076818

The full series of assessment functions can then be run:

# ASSESSMENT 1: Variable Importance
xgb_dewpoint_importance <- plotXGBImportance(xgb_dewpoint, 
                                             featureStems=dewXGBPreds, 
                                             stemMapper = varMapper, 
                                             plotTitle="Gain by variable in xgboost", 
                                             plotSubtitle="Dewpoint (four key locales modeled)"
                                             )

# ASSESSMENT 2: Evolution of training error
plotXGBEvolution(xgb_dewpoint, isRegression=TRUE, label_every=10, yLim=c(0, NA), show_line=FALSE)

## # A tibble: 200 x 3
##     iter type  error
##    <dbl> <chr> <dbl>
##  1     1 train 34.9 
##  2     2 train 25.1 
##  3     3 train 18.3 
##  4     4 train 13.9 
##  5     5 train 11.0 
##  6     6 train  9.15
##  7     7 train  8.05
##  8     8 train  7.43
##  9     9 train  7.05
## 10    10 train  6.82
## # ... with 190 more rows
# ASSESSMENT 3: Assess performance on test dataset
assessTestData(xgb_dewpoint, depVar="DewF", reportBy="locNamefct", isRegression=TRUE)
## 
## OVERALL PREDICTIVE PERFORMANCE:
## 
## # A tibble: 1 x 3
##   rmse_overall rmse_model   rsq
##          <dbl>      <dbl> <dbl>
## 1         19.2       5.73 0.911
## 
## 
## PREDICTIVE PERFORMANCE BY GROUP(S):
## 
## # A tibble: 4 x 4
##   locNamefct      rmse_overall rmse_model   rsq
##   <fct>                  <dbl>      <dbl> <dbl>
## 1 Chicago, IL             19.8       4.03 0.959
## 2 Las Vegas, NV           13.9       8.64 0.611
## 3 New Orleans, LA         14.3       3.80 0.930
## 4 San Diego, CA           10.5       5.10 0.762
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'

# ASSESSMENT 4: Prediction quality vs. confidence (currently only implemented for classification)

# ASSESSMENT 5: Performance on other data (currently only implemented for classification)

The model primarily estimates dewpoint based on temperature and locale. Predictions are most accurate for Chicago and New Orleans, and least accurate for Las Vegas. There does not appear to be much learning that can generalize to unseen data after the first hundred rounds.

The process is repeated, this time attempting to model the ceiling height for the four key locales. The minimum cloud height is included as a predictor, which may be a source of data leakage:

# Define key predictor variables for base XGB runs
ceilXGBPreds <- c("locNamefct", "month", "hrfct", 
                  "TempF", "DewF", "modSLP", "Altimeter", "WindSpeed", 
                  "predomDir", "minHeight"
                  )

# Core multi-year cities
multiYearLocales <- c("Las Vegas, NV", "New Orleans, LA", "Chicago, IL", "San Diego, CA")

# Ceiling types
ceilTypes <- sort(as.character(unique(metarData$ceilingHeight)))
ceilTypes
## [1] "High"    "Low"     "Medium"  "None"    "Surface"
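For multi:softprob, xgboost expects integer class labels in 0..num_class-1, and the sort order of ceilTypes determines which integer corresponds to which category. A sketch of that conversion (assuming xgbRunModel_002() encodes the labels this way internally):

```r
ceilTypes <- c("High", "Low", "Medium", "None", "Surface")  # sorted levels, as above
y <- factor(c("Low", "None", "High", "Surface"), levels=ceilTypes)

# Zero-based integer labels for xgboost's multi:softprob objective
yLabel <- as.integer(y) - 1L
# yLabel is c(1L, 3L, 0L, 4L)
```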
# Run the function shell
xgb_ceiling_cv <- xgbRunModel_002(metarData, 
                                  depVar="ceilingHeight", 
                                  predVars=ceilXGBPreds, 
                                  otherVars=keepVarFull, 
                                  critFilter=list(locNamefct=multiYearLocales),
                                  critFilterNot=list(year=2016),
                                  seed=2008141340,
                                  nrounds=500,
                                  print_every_n=50, 
                                  xgbObjective="multi:softprob", 
                                  funcRun=xgboost::xgb.cv, 
                                  nfold=5, 
                                  num_class=length(ceilTypes)
                                  )
## [1]  train-merror:0.208730+0.001512  test-merror:0.212616+0.002543 
## [51] train-merror:0.164895+0.000666  test-merror:0.182936+0.003078 
## [101]    train-merror:0.142586+0.000619  test-merror:0.177912+0.002985 
## [151]    train-merror:0.125482+0.000916  test-merror:0.176978+0.002699 
## [201]    train-merror:0.110178+0.001113  test-merror:0.175823+0.003838 
## [251]    train-merror:0.098015+0.000810  test-merror:0.175757+0.003341 
## [301]    train-merror:0.086571+0.000563  test-merror:0.175946+0.002985 
## [351]    train-merror:0.076183+0.000455  test-merror:0.175888+0.003238 
## [401]    train-merror:0.066773+0.000777  test-merror:0.176511+0.003567 
## [451]    train-merror:0.058898+0.001198  test-merror:0.176609+0.003041 
## [500]    train-merror:0.051704+0.000581  test-merror:0.176413+0.003466

An assessment of the train-test error can be performed:

plotXGBEvolution(xgb_ceiling_cv, isRegression=FALSE, yLim=c(0, NA), show_line=TRUE, label_every=NULL)

## # A tibble: 1,000 x 3
##     iter type  error
##    <dbl> <chr> <dbl>
##  1     1 train 0.209
##  2     1 test  0.213
##  3     2 train 0.203
##  4     2 test  0.207
##  5     3 train 0.201
##  6     3 test  0.205
##  7     4 train 0.199
##  8     4 test  0.203
##  9     5 train 0.197
## 10     5 test  0.201
## # ... with 990 more rows

Test error is fairly stable after 200 rounds, at roughly 17.6%. The full model is run using 200 rounds:

# Run the function shell
xgb_ceiling <- xgbRunModel_002(metarData, 
                               depVar="ceilingHeight", 
                               predVars=ceilXGBPreds, 
                               otherVars=keepVarFull, 
                               critFilter=list(locNamefct=multiYearLocales),
                               critFilterNot=list(year=2016),
                               seed=2008141350,
                               nrounds=200,
                               print_every_n=50, 
                               xgbObjective="multi:softprob", 
                               funcRun=xgboost::xgboost, 
                               num_class=length(ceilTypes)
                               )
## [1]  train-merror:0.205897 
## [51] train-merror:0.167661 
## [101]    train-merror:0.147551 
## [151]    train-merror:0.133145 
## [200]    train-merror:0.120460

The full series of assessment functions can then be run:

# ASSESSMENT 1: Variable Importance
xgb_ceiling_importance <- plotXGBImportance(xgb_ceiling, 
                                            featureStems=ceilXGBPreds, 
                                            stemMapper = varMapper, 
                                            plotTitle="Gain by variable in xgboost", 
                                            plotSubtitle="Ceiling Height (four key locales modeled)"
                                            )

# ASSESSMENT 2: Evolution of training error
plotXGBEvolution(xgb_ceiling, isRegression=FALSE, label_every=NULL, yLim=c(0, NA), show_line=TRUE)

## # A tibble: 200 x 3
##     iter type  error
##    <dbl> <chr> <dbl>
##  1     1 train 0.206
##  2     2 train 0.201
##  3     3 train 0.199
##  4     4 train 0.198
##  5     5 train 0.196
##  6     6 train 0.195
##  7     7 train 0.193
##  8     8 train 0.191
##  9     9 train 0.190
## 10    10 train 0.189
## # ... with 190 more rows
# ASSESSMENT 3: Assess performance on test dataset
assessTestData(xgb_ceiling, depVar="ceilingHeight", isClassification=TRUE)

# ASSESSMENT 4: Prediction quality vs. confidence (currently only implemented for classification)
plotPredictionConfidencevQuality(xgb_ceiling, depVar="ceilingHeight", dataLim=5)

# ASSESSMENT 5: Performance on other data (currently only implemented for classification)
df_ceiling_pred <- assessNonModelDataPredictions(mdl=xgb_ceiling$xgbModel, 
                                                 df=filter(metarData, 
                                                           !is.na(TempF), 
                                                           year==2016, 
                                                           locNamefct %in% fourLocales
                                                           ), 
                                                 depVar="ceilingHeight", 
                                                 predVars=ceilXGBPreds, 
                                                 yLevels=xgb_ceiling$yTrainLevels, 
                                                 ySortVar="None"
                                                 )

# Additional assessments
xgb_ceiling$testData %>%
    mutate(correct=(ceilingHeight==predicted)) %>%
    count(ceilingHeight, correct) %>%
    ggplot(aes(x=ceilingHeight, y=n, fill=correct)) + 
    geom_col(position="stack")

xgb_ceiling$testData %>%
    mutate(correct=(ceilingHeight==predicted)) %>%
    count(ceilingHeight, correct) %>%
    ggplot(aes(x=ceilingHeight, y=n, fill=correct)) + 
    geom_col(position="fill")

xgb_ceiling$testData %>%
    mutate(correct=(ceilingHeight==predicted)) %>%
    count(minHeight, correct) %>%
    ggplot(aes(x=minHeight, y=n, fill=correct)) + 
    geom_col(position="stack")

xgb_ceiling$testData %>%
    mutate(correct=(ceilingHeight==predicted)) %>%
    count(minHeight, correct) %>%
    ggplot(aes(x=minHeight, y=n, fill=correct)) + 
    geom_col(position="fill")

xgb_ceiling$testData %>%
    mutate(correct=(ceilingHeight==predicted)) %>%
    count(minHeight, predicted) %>%
    group_by(minHeight) %>%
    mutate(pct=n/sum(n)) %>%
    ggplot(aes(x=minHeight, y=predicted)) + 
    geom_tile(aes(fill=n)) + 
    geom_text(aes(label=paste0(round(100*pct), "%"))) + 
    scale_fill_continuous(low="white", high="green")

The model's prediction probabilities align well with its observed percentage of correct predictions. Minimum cloud height is, as expected, a key driver of ceiling height, followed by temperature, dewpoint, and locale.

Due to significant class imbalance, overall prediction accuracy is high even though accuracy is poor for the Medium and High ceiling classes. The model generally does well in classifying Surface ceilings and None (cases with no ceiling); the latter is driven by the somewhat trivial prediction that there is no ceiling when there are no clouds. Prediction accuracy is about 75% in the remaining cases where at least some clouds are present. In general, the model predicts either minHeight or None as the ceiling.
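To illustrate how class imbalance can inflate overall accuracy, a minimal base-R sketch (using invented counts, not the model output above) compares overall accuracy against per-class accuracy:

```r
# Hypothetical imbalanced two-class problem:
# 900 "None" observations vs. 100 "Medium" observations
actual    <- c(rep("None", 900), rep("Medium", 100))

# A model that predicts "None" almost everywhere, catching only 20 Medium cases
predicted <- c(rep("None", 900), rep("None", 80), rep("Medium", 20))

# Overall accuracy looks strong...
overall <- mean(actual == predicted)
overall              # 0.92

# ...but per-class accuracy exposes the weakness on the minority class
perClass <- tapply(actual == predicted, actual, mean)
perClass["None"]     # 1.00
perClass["Medium"]   # 0.20
```

This is why the per-class plots above (stacked and filled bars by ceilingHeight) are a more informative check than the headline accuracy.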

Next, the process is run to classify each of the locales in the 2016 data:

# Extract the locales that are available in 2016
locs2016 <- metarData %>%
    filter(year==2016) %>%
    pull(locNamefct) %>%
    as.character() %>%
    unique() %>%
    sort()

# Run the function shell for CV
xgb_alllocales_cv <- xgbRunModel_002(metarData, 
                                     depVar="locNamefct", 
                                     predVars=locXGBPreds, 
                                     otherVars=keepVarFull, 
                                     critFilter=list(year=2016),
                                     seed=2008151247,
                                     nrounds=250,
                                     print_every_n=10, 
                                     xgbObjective="multi:softmax", 
                                     funcRun=xgboost::xgb.cv, 
                                     nfold=5, 
                                     num_class=length(locs2016)
                                     )
## [1]  train-merror:0.745627+0.001385  test-merror:0.760504+0.002577 
## [11] train-merror:0.617385+0.000686  test-merror:0.658066+0.001841 
## [21] train-merror:0.560972+0.001412  test-merror:0.618157+0.001144 
## [31] train-merror:0.514607+0.001192  test-merror:0.587029+0.000623 
## [41] train-merror:0.475156+0.001908  test-merror:0.562757+0.001588 
## [51] train-merror:0.443247+0.001347  test-merror:0.544287+0.001805 
## [61] train-merror:0.415363+0.001760  test-merror:0.529081+0.001766 
## [71] train-merror:0.390475+0.001544  test-merror:0.515730+0.002567 
## [81] train-merror:0.369088+0.001306  test-merror:0.505884+0.003183 
## [91] train-merror:0.349051+0.001423  test-merror:0.496971+0.003049 
## [101]    train-merror:0.330882+0.001623  test-merror:0.489444+0.002999 
## [111]    train-merror:0.314679+0.001510  test-merror:0.482507+0.002212 
## [121]    train-merror:0.299173+0.001012  test-merror:0.476148+0.002386 
## [131]    train-merror:0.285131+0.001122  test-merror:0.470548+0.002154 
## [141]    train-merror:0.271809+0.001310  test-merror:0.466558+0.001910 
## [151]    train-merror:0.259201+0.001448  test-merror:0.462459+0.001562 
## [161]    train-merror:0.247159+0.001442  test-merror:0.458628+0.001773 
## [171]    train-merror:0.235932+0.001559  test-merror:0.455020+0.001292 
## [181]    train-merror:0.225655+0.001483  test-merror:0.451456+0.001546 
## [191]    train-merror:0.215591+0.001664  test-merror:0.448148+0.001480 
## [201]    train-merror:0.205799+0.001741  test-merror:0.444857+0.002074 
## [211]    train-merror:0.196784+0.001426  test-merror:0.442117+0.001527 
## [221]    train-merror:0.188253+0.001439  test-merror:0.439731+0.001641 
## [231]    train-merror:0.180301+0.001789  test-merror:0.436762+0.001768 
## [241]    train-merror:0.172243+0.001413  test-merror:0.434786+0.001619 
## [250]    train-merror:0.165748+0.001590  test-merror:0.432473+0.001415
# Plot the error evolution
plotXGBEvolution(xgb_alllocales_cv, 
                 isRegression=FALSE, 
                 yLim=c(0, NA), 
                 show_line=TRUE, 
                 label_every=NULL, 
                 first_iter_plot = 1
                 )

## # A tibble: 500 x 3
##     iter type  error
##    <dbl> <chr> <dbl>
##  1     1 train 0.746
##  2     1 test  0.761
##  3     2 train 0.712
##  4     2 test  0.730
##  5     3 train 0.692
##  6     3 test  0.714
##  7     4 train 0.678
##  8     4 test  0.703
##  9     5 train 0.666
## 10     5 test  0.694
## # ... with 490 more rows

The function can then be run for 250 rounds:

# Run the function shell
xgb_alllocales <- xgbRunModel_002(metarData, 
                                  depVar="locNamefct", 
                                  predVars=locXGBPreds, 
                                  otherVars=keepVarFull, 
                                  critFilter=list(year=2016),
                                  seed=2008151315,
                                  nrounds=250,
                                  print_every_n=10, 
                                  xgbObjective="multi:softprob", 
                                  funcRun=xgboost::xgboost, 
                                  num_class=length(locs2016)
                                  )
## [1]  train-merror:0.743338 
## [11] train-merror:0.624117 
## [21] train-merror:0.570813 
## [31] train-merror:0.523857 
## [41] train-merror:0.485858 
## [51] train-merror:0.455118 
## [61] train-merror:0.428406 
## [71] train-merror:0.405127 
## [81] train-merror:0.385849 
## [91] train-merror:0.366451 
## [101]    train-merror:0.349280 
## [111]    train-merror:0.333621 
## [121]    train-merror:0.319086 
## [131]    train-merror:0.306117 
## [141]    train-merror:0.294071 
## [151]    train-merror:0.282233 
## [161]    train-merror:0.271005 
## [171]    train-merror:0.260160 
## [181]    train-merror:0.250112 
## [191]    train-merror:0.240446 
## [201]    train-merror:0.231571 
## [211]    train-merror:0.222396 
## [221]    train-merror:0.214176 
## [231]    train-merror:0.205896 
## [241]    train-merror:0.198020 
## [250]    train-merror:0.191677

And the evaluation functions can then be run:

# ASSESSMENT 1: Variable Importance
xgb_alllocales_importance <- plotXGBImportance(xgb_alllocales, 
                                               featureStems=locXGBPreds, 
                                               stemMapper = varMapper, 
                                               plotTitle="Gain by variable in xgboost", 
                                               plotSubtitle="Locale (2016)"
                                               )

# ASSESSMENT 2: Evolution of training error
plotXGBEvolution(xgb_alllocales, isRegression=FALSE, label_every=NULL, yLim=c(0, NA), show_line=TRUE)

## # A tibble: 250 x 3
##     iter type  error
##    <dbl> <chr> <dbl>
##  1     1 train 0.743
##  2     2 train 0.713
##  3     3 train 0.698
##  4     4 train 0.683
##  5     5 train 0.671
##  6     6 train 0.662
##  7     7 train 0.654
##  8     8 train 0.647
##  9     9 train 0.639
## 10    10 train 0.631
## # ... with 240 more rows
# ASSESSMENT 3: Assess performance on test dataset
assessTestData(xgb_alllocales, depVar="locNamefct", reportBy="locNamefct", isClassification=TRUE)

# ASSESSMENT 4: Prediction quality vs. confidence (currently only implemented for classification)
plotPredictionConfidencevQuality(xgb_alllocales, depVar="locNamefct", dataLim=5)

# ASSESSMENT 5: Performance on other data (currently only implemented for classification)
df_alllocales_pred <- assessNonModelDataPredictions(mdl=xgb_alllocales$xgbModel, 
                                                    df=filter(metarData, 
                                                              !is.na(TempF), 
                                                              year!=2016, 
                                                              locNamefct %in% fourLocales
                                                              ), 
                                                    depVar="locNamefct", 
                                                    predVars=locXGBPreds, 
                                                    yLevels=xgb_alllocales$yTrainLevels, 
                                                    ySortVar="None"
                                                    )

The model is reasonably effective at classifying many of the more "distinct" locales, but struggles with cold-weather locales such as Chicago. Next steps are to build on the previous one vs. all approach to see whether that can speed up the process or enhance the predictive power (overall or for trickier locales).
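For reference, converting a multi:softprob probability matrix into class labels is a row-wise argmax. A small base-R sketch (with invented probabilities and class names) shows the idea:

```r
# Invented softprob-style output: one row per observation, one column per class
probMat <- matrix(c(0.1, 0.7, 0.2,
                    0.5, 0.3, 0.2),
                  nrow = 2, byrow = TRUE)
classLevels <- c("Low", "Medium", "High")

# Predicted label = class with the highest probability in each row
predLabels <- classLevels[max.col(probMat)]
predLabels  # "Medium" "Low"
```

The helper functions used in this document presumably perform an equivalent step internally when probMatrix=TRUE.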

Two approaches are taken to model individual locales in 2016:

  • Locale vs. all-other, with under-sampling of all-other so that classes are balanced
  • Locale vs. locale for a specific city pairing

Test data can then be run through the locale vs. all-other algorithms in an attempt at classification. Close calls can be further run through the relevant locale vs. locale classification. Example code includes:

# Create the overall test and train data
fullDataSplit <- createTestTrain(filter(metarData, year==2016, !is.na(TempF)), 
                                 noNA=FALSE, 
                                 seed=2008161312
                                 )

# Extract the training data for the remainder of the process
localeTrainData <- fullDataSplit$trainData

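createTestTrain() is a project-specific helper; as a minimal base-R sketch of a split of that shape (the 80/20 proportion is an assumption, not necessarily what the helper uses):

```r
# Reproducible 80/20 test/train split (proportion assumed;
# createTestTrain() is a project helper whose internals may differ)
set.seed(2008161312)
df <- data.frame(id = 1:1000, x = rnorm(1000))

trainIdx  <- sample(nrow(df), size = floor(0.8 * nrow(df)))
trainData <- df[trainIdx, ]
testData  <- df[-trainIdx, ]

nrow(trainData)  # 800
nrow(testData)   # 200
```

The key property to preserve is that every record lands in exactly one of the two sets.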
The models for one vs. all can then be run and cached:

# Create the locations to be modeled
useLocs <- locs2016

# Create a container for storing all relevant objects
localeOnevAll <- vector("list", length(useLocs))

# Run the modeling process once for each locale
n <- 1
for (thisLoc in useLocs) {
    
    # Announce the progress
    cat("\nProcess for:", thisLoc, "\n")
    
    # Create the training data to use for the model
    thisTrain <- localeTrainData %>%
        mutate(curLocale=factor(ifelse(locNamefct==thisLoc, thisLoc, "All Other"), 
                                levels=c(thisLoc, "All Other")
                                )
               )
    
    # Find the smallest group size
    smallN <- thisTrain %>%
        count(curLocale) %>%
        pull(n) %>%
        min()
    
    # Balance the samples in thisTrain
    thisTrain <- thisTrain %>%
        group_by(curLocale) %>%
        sample_n(smallN) %>%
        ungroup()
    
    # Run the CV process with early stopping if 5 rounds show no improvement
    xgb_thislocale_cv <- xgbRunModel_002(thisTrain, 
                                         depVar="curLocale", 
                                         predVars=locXGBPreds, 
                                         otherVars=keepVarFull, 
                                         seed=2008161330+n,
                                         nrounds=500,
                                         print_every_n=500, 
                                         xgbObjective="multi:softmax", 
                                         funcRun=xgboost::xgb.cv, 
                                         nfold=5, 
                                         num_class=2, 
                                         early_stopping_rounds=5
                                         )

    # The best iteration can then be pulled
    bestN <- xgb_thislocale_cv$xgbModel$best_iteration
    
    # And the model can be run for that number of iterations
    xgb_thislocale <- xgbRunModel_002(thisTrain, 
                                      depVar="curLocale", 
                                      predVars=locXGBPreds, 
                                      otherVars=keepVarFull, 
                                      seed=2008161330+100+n,
                                      nrounds=bestN,
                                      print_every_n=500, 
                                      xgbObjective="multi:softprob", 
                                      funcRun=xgboost::xgboost, 
                                      num_class=2
                                      )
    
    
    # Place the trained CV object in the relevant list
    localeOnevAll[[n]] <- list(cvResult=xgb_thislocale_cv, 
                               mdlResult=xgb_thislocale, 
                               bestN=bestN
                               )
    
    # Increment the counter
    n <- n + 1
    
}
## 
## Process for: Atlanta, GA 
## [1]  train-merror:0.296777+0.005753  test-merror:0.317279+0.005786 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [61] train-merror:0.046547+0.001924  test-merror:0.156963+0.008502
## 
## [1]  train-merror:0.295680 
## [61] train-merror:0.054054 
## 
## Process for: Boston, MA 
## [1]  train-merror:0.307703+0.012094  test-merror:0.343419+0.013340 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [120]    train-merror:0.019682+0.001777  test-merror:0.168787+0.007849
## 
## [1]  train-merror:0.320406 
## [120]    train-merror:0.034575 
## 
## Process for: Chicago, IL 
## [1]  train-merror:0.379727+0.010129  test-merror:0.403161+0.012830 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [44] train-merror:0.119351+0.002133  test-merror:0.264470+0.006281
## 
## [1]  train-merror:0.380101 
## [44] train-merror:0.130274 
## 
## Process for: Dallas, TX 
## [1]  train-merror:0.276898+0.002016  test-merror:0.298793+0.015257 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [49] train-merror:0.050053+0.003491  test-merror:0.144541+0.012501
## 
## [1]  train-merror:0.275056 
## [49] train-merror:0.067828 
## 
## Process for: Denver, CO 
## [1]  train-merror:0.153111+0.005850  test-merror:0.164679+0.005311 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [49] train-merror:0.007260+0.000315  test-merror:0.041554+0.004322
## 
## [1]  train-merror:0.165270 
## [49] train-merror:0.009916 
## 
## Process for: Detroit, MI 
## [1]  train-merror:0.369372+0.009312  test-merror:0.402094+0.009898 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [36] train-merror:0.133828+0.004351  test-merror:0.254799+0.010743
## 
## [1]  train-merror:0.375335 
## [36] train-merror:0.145201 
## 
## Process for: Grand Rapids, MI 
## [1]  train-merror:0.361565+0.004385  test-merror:0.389953+0.008254 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [33] train-merror:0.132126+0.003853  test-merror:0.248364+0.010590
## 
## [1]  train-merror:0.369860 
## [33] train-merror:0.146028 
## 
## Process for: Green Bay, WI 
## [1]  train-merror:0.272149+0.006305  test-merror:0.294385+0.010471 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [46] train-merror:0.078079+0.002759  test-merror:0.194539+0.005848
## 
## [1]  train-merror:0.302707 
## [46] train-merror:0.087308 
## 
## Process for: Houston, TX 
## [1]  train-merror:0.207666+0.005346  test-merror:0.224158+0.007253 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [73] train-merror:0.031881+0.003876  test-merror:0.126829+0.002340
## 
## [1]  train-merror:0.200813 
## [73] train-merror:0.040070 
## 
## Process for: Indianapolis, IN 
## [1]  train-merror:0.384455+0.008535  test-merror:0.406197+0.012957 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [37] train-merror:0.147467+0.005538  test-merror:0.268561+0.013552
## 
## [1]  train-merror:0.373121 
## [37] train-merror:0.138214 
## 
## Process for: Las Vegas, NV 
## [1]  train-merror:0.102757+0.003954  test-merror:0.118241+0.009067 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [38] train-merror:0.018768+0.000720  test-merror:0.060175+0.005514
## 
## [1]  train-merror:0.104868 
## [38] train-merror:0.022757 
## 
## Process for: Lincoln, NE 
## [1]  train-merror:0.278283+0.007193  test-merror:0.306160+0.011818 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [66] train-merror:0.050735+0.001483  test-merror:0.192324+0.006210
## 
## [1]  train-merror:0.287614 
## [66] train-merror:0.067063 
## 
## Process for: Los Angeles, CA 
## [1]  train-merror:0.173261+0.002408  test-merror:0.190622+0.008925 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [73] train-merror:0.020196+0.001783  test-merror:0.084801+0.005638
## 
## [1]  train-merror:0.172198 
## [73] train-merror:0.030471 
## 
## Process for: Madison, WI 
## [1]  train-merror:0.351740+0.010004  test-merror:0.367496+0.002700 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [40] train-merror:0.106493+0.002249  test-merror:0.235950+0.012628
## 
## [1]  train-merror:0.359962 
## [40] train-merror:0.122100 
## 
## Process for: Miami, FL 
## [1]  train-merror:0.112409+0.004302  test-merror:0.124898+0.005241 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [68] train-merror:0.010224+0.000474  test-merror:0.054607+0.004308
## 
## [1]  train-merror:0.115487 
## [68] train-merror:0.011851 
## 
## Process for: Milwaukee, WI 
## [1]  train-merror:0.356984+0.014230  test-merror:0.372066+0.017789 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [48] train-merror:0.108187+0.001064  test-merror:0.249648+0.004762
## 
## [1]  train-merror:0.376526 
## [48] train-merror:0.125117 
## 
## Process for: Minneapolis, MN 
## [1]  train-merror:0.341443+0.010364  test-merror:0.359642+0.009594 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [34] train-merror:0.116466+0.003693  test-merror:0.219045+0.003888
## 
## [1]  train-merror:0.350035 
## [34] train-merror:0.128327 
## 
## Process for: New Orleans, LA 
## [1]  train-merror:0.199070+0.005656  test-merror:0.214848+0.009807 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [112]    train-merror:0.001171+0.000207  test-merror:0.053857+0.005244
## 
## [1]  train-merror:0.202201 
## [112]    train-merror:0.003044 
## 
## Process for: Newark, NJ 
## [1]  train-merror:0.338524+0.007223  test-merror:0.372730+0.006927 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [69] train-merror:0.065120+0.002231  test-merror:0.202493+0.012154
## 
## [1]  train-merror:0.338496 
## [69] train-merror:0.080578 
## 
## Process for: Philadelphia, PA 
## [1]  train-merror:0.344060+0.006896  test-merror:0.368407+0.007542 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [86] train-merror:0.049690+0.001082  test-merror:0.207060+0.007782
## 
## [1]  train-merror:0.366187 
## [86] train-merror:0.066760 
## 
## Process for: Phoenix, AZ 
## [1]  train-merror:0.106887+0.006681  test-merror:0.125866+0.006013 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [52] train-merror:0.009842+0.000264  test-merror:0.051828+0.002367
## 
## [1]  train-merror:0.100717 
## [52] train-merror:0.011987 
## 
## Process for: Saint Louis, MO 
## [1]  train-merror:0.358889+0.009795  test-merror:0.397208+0.008024 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [77] train-merror:0.046705+0.000746  test-merror:0.200071+0.005146
## 
## [1]  train-merror:0.377507 
## [77] train-merror:0.057934 
## 
## Process for: San Antonio, TX 
## [1]  train-merror:0.243063+0.006245  test-merror:0.258708+0.012210 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [81] train-merror:0.000738+0.000324  test-merror:0.036132+0.004359
## 
## [1]  train-merror:0.233322 
## [81] train-merror:0.000827 
## 
## Process for: San Diego, CA 
## [1]  train-merror:0.142786+0.002565  test-merror:0.156132+0.009671 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [72] train-merror:0.010352+0.000812  test-merror:0.061117+0.005535
## 
## [1]  train-merror:0.142991 
## [72] train-merror:0.012903 
## 
## Process for: San Francisco, CA 
## [1]  train-merror:0.147587+0.004021  test-merror:0.162684+0.009303 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [51] train-merror:0.032637+0.001836  test-merror:0.092172+0.007108
## 
## [1]  train-merror:0.143614 
## [51] train-merror:0.034020 
## 
## Process for: San Jose, CA 
## [1]  train-merror:0.165947+0.005089  test-merror:0.175980+0.015322 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [45] train-merror:0.047088+0.002797  test-merror:0.100009+0.011056
## 
## [1]  train-merror:0.179834 
## [45] train-merror:0.051348 
## 
## Process for: Seattle, WA 
## [1]  train-merror:0.208717+0.006803  test-merror:0.226163+0.009529 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [66] train-merror:0.007206+0.000764  test-merror:0.055432+0.006443
## 
## [1]  train-merror:0.212277 
## [66] train-merror:0.007702 
## 
## Process for: Tampa Bay, FL 
## [1]  train-merror:0.178346+0.005434  test-merror:0.196844+0.007991 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [60] train-merror:0.026622+0.002298  test-merror:0.087668+0.009293
## 
## [1]  train-merror:0.175570 
## [60] train-merror:0.028872 
## 
## Process for: Traverse City, MI 
## [1]  train-merror:0.282592+0.004375  test-merror:0.301956+0.010232 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [79] train-merror:0.047528+0.004710  test-merror:0.179227+0.007478
## 
## [1]  train-merror:0.296631 
## [79] train-merror:0.060901 
## 
## Process for: Washington, DC 
## [1]  train-merror:0.325065+0.007524  test-merror:0.351296+0.009522 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [78] train-merror:0.052552+0.003684  test-merror:0.195575+0.001873
## 
## [1]  train-merror:0.347490 
## [78] train-merror:0.061861
# Name the list
names(localeOnevAll) <- useLocs

Error statistics can then be calculated and reported:

# Report on summary statistics
# Test error by CV file
cvError <- map_dfr(localeOnevAll, 
                   .f=function(x) {x$cvResult$xgbModel$evaluation_log[x$bestN, ]}, 
                   .id="locale"
                   )

# Function to extract key error data
dummyError <- function(x) {
    x$mdlResult$testData %>%
        mutate(correct=ifelse(predicted==curLocale, 1, 0), 
               self=ifelse(curLocale=="All Other", 0, 1), 
               pctCorrectOverall=mean(correct)
               ) %>%
        group_by(self) %>%
        summarize(pctCorrectGroup=mean(correct), pctCorrectOverall=mean(pctCorrectOverall)) %>%
        ungroup()
}

# Test error by model file
mdlError <- map_dfr(localeOnevAll, 
                    .f=dummyError,
                    .id="locale"
                    )

# Plot of error rates - overall for CV and model
mdlError %>%
    inner_join(cvError) %>%
    filter(self==1) %>%
    mutate(pctCorrectCV=1-test_merror_mean) %>%
    select(locale, Overall=pctCorrectOverall, CV=pctCorrectCV) %>%
    pivot_longer(-locale) %>%
    ggplot(aes(x=fct_reorder(locale, value), y=value)) + 
    geom_point(aes(color=name)) + 
    coord_flip() + 
    labs(y="Test Accuracy", title="CV and Model Accuracy on Test Data", x="") + 
    scale_color_discrete("Model Type") + 
    ylim(c(0, 1))
## Joining, by = "locale"

# Plot of error rates - self vs. other for model
mdlError %>%
    ggplot(aes(x=fct_reorder(locale, pctCorrectOverall), y=pctCorrectGroup)) + 
    geom_point(aes(color=factor(self))) + 
    coord_flip() + 
    labs(y="Test Accuracy", title="Model Accuracy on Test Data", subtitle="Self=1, All Other=0", x="") + 
    scale_color_discrete("Classifying Self?") + 
    ylim(c(0, 1))

CV errors are generally well aligned with modeling errors, as expected. As seen previously, each model is generally more successful at classifying its own locale ("self") than at classifying the "All Other" group. Suppose that just these one vs. all models were used to make predictions for all of the test data:

# Prepare the test data
testData <- fullDataSplit$testData %>%
    select_at(vars(all_of(c(locXGBPreds, keepVarFull)))) %>%
    filter(locNamefct %in% useLocs) %>%
    mutate_if(is.factor, .funs=fct_drop)

# Function to create the probabilities by locale
predictLocaleProbs <- function(x) {
    helperXGBPredict(x$mdlResult$xgbModel, 
                     dfSparse=helperMakeSparse(testData, 
                                               depVar="locNamefct", 
                                               predVars=locXGBPreds
                                               ), 
                     objective="multi:softprob", 
                     probMatrix=TRUE, 
                     yLevels=x$mdlResult$yTrainLevels
                     )$probData %>%
        select(-`All Other`)
}

# Extract from list
locData <- testData %>%
    select(locNamefct, source, dtime) %>%
    bind_cols(map_dfc(localeOnevAll, .f=predictLocaleProbs))

# Convert to prediction based purely on maximum probability
sampPreds <- locData %>%
    mutate(record=row_number()) %>%
    pivot_longer(-c(record, locNamefct, source, dtime)) %>%
    group_by(record, locNamefct, source, dtime) %>%
    filter(value==max(value)) %>%
    ungroup() %>%
    mutate(correct=(as.character(locNamefct)==name))

# Overall accuracy by locale
sampPreds %>%
    group_by(locNamefct) %>%
    summarize(pctCorrect=mean(correct), n=n()) %>%
    ggplot(aes(x=fct_reorder(locNamefct, pctCorrect), y=pctCorrect)) + 
    geom_point() + 
    geom_text(aes(y=pctCorrect+0.02, label=paste0(round(100*pctCorrect), "%")), hjust=0, size=3.5) + 
    coord_flip() + 
    labs(x="Actual Locale", y="Percent Correctly Predicted", 
         title="Predictions based on Maximum Probability"
         ) + 
    ylim(c(0, 1.1))

# Confusion matrix
sampPreds %>%
    mutate(name=factor(name)) %>%
    count(locNamefct, name) %>%
    group_by(locNamefct) %>%
    mutate(pct=n/sum(n)) %>%
    ungroup() %>%
    ggplot(aes(x=locNamefct, y=name)) + 
    geom_tile(aes(fill=pct)) + 
    geom_text(aes(label=paste0(round(100*pct), "%")), size=2) + 
    coord_flip() + 
    labs(x="Actual Locale", y="Predicted Locale", 
         title="Predictions based on Maximum Probability"
         ) + 
    scale_fill_continuous(low="white", high="lightgreen") + 
    theme(axis.text.x=element_text(angle=90, hjust=1))

There are meaningful classification issues with the base predictor. Next steps are to find cases where a city has multiple “likely” predictions and to use one vs. one modeling to help make those decisions.
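As a first pass at that next step, the flagging of records with more than one "likely" locale can be sketched as follows. This is a hypothetical illustration: the toy tibble, its values, and the 0.25 closeness threshold are assumptions, not taken from locData, though the column layout (one probability column per locale) mirrors it.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for locData: one row per record, one probability column per locale
toyData <- tibble(record=1:3,
                  locNamefct=c("A", "B", "C"),
                  A=c(0.80, 0.10, 0.30),
                  B=c(0.15, 0.85, 0.35),
                  C=c(0.05, 0.05, 0.35))

# Flag records where two or more locales fall within 0.25 of the highest probability
ambiguous <- toyData %>%
    pivot_longer(-c(record, locNamefct)) %>%
    group_by(record, locNamefct) %>%
    summarize(nLikely=sum(value >= max(value) - 0.25), .groups="drop") %>%
    filter(nLikely > 1)
```

Records flagged this way would then be routed to one vs. one models for a tie-breaking decision, while unambiguous records keep their maximum-probability prediction.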

Each predicted locale has an associated probability that can be used to help model the data. For example, a look at twenty random records shows:

# Initialize a seed so that the random sampling is consistent
set.seed(2008171324)

# Take 20 random records and see the assigned probabilities
locData %>%
    mutate(record=row_number()) %>%
    sample_n(size=20) %>%
    pivot_longer(-c(record, locNamefct, source, dtime)) %>%
    ggplot(aes(x=paste0(locNamefct, " - ", record), y=factor(name))) + 
    geom_tile(aes(fill=value)) + 
    geom_text(aes(label=paste0(round(100*value), "%")), size=2.5) + 
    coord_flip() + 
    labs(x="Actual Locale", y="Potential Locale", title="Probability Associated to Potential Locale") + 
    theme(axis.text.x=element_text(angle=90, hjust=1)) + 
    scale_fill_continuous(low="white", high="green")

It is common for a record to have plausible matches to several potential locales, and the highest probability is not always associated with the correct locale. Suppose that for every actual locale, the following statistics are gathered:

  • Highest predicted probability
  • Second highest predicted probability
  • Accuracy of highest predicted probability locale
  • Predicted probability of correct locale
# Calculate key data for every prediction
predStats <- locData %>%
    mutate(record=row_number()) %>%
    pivot_longer(-c(record, locNamefct, source, dtime)) %>%
    mutate(correct=(locNamefct==name)) %>%
    group_by(record, locNamefct, source, dtime) %>%
    arrange(record, locNamefct, source, dtime, -value) %>%
    summarize(highProb=nth(value, 1), 
              nextProb=nth(value, 2), 
              accHigh=nth(correct, 1),
              actualLocaleProb=sum(ifelse(correct, value, 0))
              ) %>%
    ungroup()

Summaries by locale can then be generated, to answer questions such as:

  • How often is the actual locale the highest prediction or the second highest prediction?
  • What is the distribution of the probabilities associated with the prediction of the actual locale?
  • What is the distribution of the difference between the highest predicted probability and the probability of the actual locale?
# Frequency by Prediction Caliber
predStats %>%
    mutate(isFirst=as.integer(accHigh), 
           isSecond=pmax(0, (actualLocaleProb==nextProb)-isFirst), 
           isClose=pmax(0, (actualLocaleProb >= highProb-0.25)-isFirst-isSecond)
           ) %>% 
    select(record, locNamefct, isFirst, isSecond, isClose) %>%
    pivot_longer(-c(record, locNamefct), names_to="type", values_to="boolean") %>%
    mutate(type=factor(type, levels=c("isClose", "isSecond", "isFirst"))) %>%
    group_by(locNamefct, type) %>%
    summarize(pct=mean(boolean)) %>%
    ggplot(aes(x=fct_reorder(locNamefct, pct, .fun=sum), y=pct)) + 
    geom_col(aes(fill=type), position="stack") + 
    coord_flip() + 
    labs(x="Actual Locale", y="Percentage of Predictions", title="Prediction Caliber by Actual Locale") + 
    theme(legend.position="bottom") + 
    scale_fill_discrete("Prediction Caliber", 
                         breaks=c("isFirst", "isSecond", "isClose"), 
                         labels=c("Highest Prob", "Second Highest Prob", "Prob Within 25%")
                         )

# Histogram of Prediction Probabilities by Actual Locale
predStats %>%
    ggplot(aes(x=actualLocaleProb)) + 
    geom_histogram(fill="lightblue") + 
    facet_wrap(~locNamefct) + 
    labs(x="", y="n", title="Predicted Probability for Actual Locale")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

# Histogram of Highest Probability minus Correct Locale Probability by Actual Locale
predStats %>%
    ggplot(aes(x=actualLocaleProb-highProb)) + 
    geom_histogram(fill="lightblue") + 
    facet_wrap(~locNamefct) + 
    labs(x="", 
         y="n", 
         title="Error in Predicted Probability for Actual Locale", 
         subtitle="Error is probability for actual locale minus highest probability for any locale"
         )
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

For some of the locales, the predicted probability is very frequently either the highest probability prediction or close to it. Predictions for these locales may be further improved by applying the one vs. one model whenever more than one locale in this subset receives a high probability.
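For intuition, one vs. one voting can be sketched minimally as below. The pairwise winners are invented for illustration, not produced by a trained model: each pairwise comparison casts one vote, and the locale with the most votes becomes the prediction.

```r
library(dplyr)

# Hypothetical pairwise winners for a single record among three locales
pairWinners <- tibble(first=c("A", "A", "B"),
                      second=c("B", "C", "C"),
                      winner=c("A", "A", "B"))

# Each one vs. one model casts one vote; the most-voted locale is the prediction
voteTally <- pairWinners %>%
    count(winner, name="votes") %>%
    arrange(desc(votes))
```

With ties, a natural fallback is the one vs. all probability already computed for each candidate.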

For other locales, the predicted probability is frequently middling, which makes it difficult to distinguish their observations from those of other locales. In particular, this challenge is observed for the four-season locales, which can be misclassified as one another and, during a peak or spiky season, as a different archetype altogether.

Further exploration can be made for prediction quality given the highest probability and the difference in the highest probability and the second highest probability:

roundStats <- predStats %>%
    mutate(roundProb=round(20*highProb)/20, 
           deltaProb=round(10*(highProb-nextProb))/10
           ) 

roundStats %>%
    group_by(roundProb, deltaProb) %>%
    summarize(n=n(), acc=mean(accHigh)) %>%
    ungroup() %>%
    filter(n >= 10) %>%
    ggplot(aes(x=factor(roundProb), y=factor(deltaProb))) + 
    geom_tile(aes(fill=acc)) + 
    geom_text(aes(label=paste0(round(100*acc), "%\n(n=", n, ")")), size=2.5) + 
    scale_fill_continuous(low="white", high="green") + 
    labs(x="Highest Predicted Probability", 
         y="Gap to Next Highest Predicted Probability", 
         title="Accuracy by Highest Probability vs. Next Highest Probability"
         )

There is a small subset of predictions where the predicted probability is high (97.5% plus) and the gap to the next highest probability is also high (25% plus). These predictions have very high accuracy:

# Plot of accuracy by condition
predStats %>%
    mutate(isLikely=(highProb >= 0.975) & (highProb-nextProb >= 0.25)) %>%
    group_by(isLikely) %>%
    summarize(acc=mean(accHigh), n=n()) %>%
    ungroup() %>%
    ggplot(aes(x=isLikely, y=acc)) + 
    geom_col(fill="lightblue") + 
    geom_text(aes(y=acc/2, label=paste0(round(100*acc), "%\n(n=", n, ")"))) + 
    labs(x="Highest Probability >= 0.975 and Delta Probability >= 0.25?", y="Accuracy")

# Plot of accuracy by condition
predStats %>%
    mutate(isLikely=(highProb >= 0.975) & (highProb-nextProb >= 0.25)) %>%
    group_by(locNamefct, isLikely) %>%
    summarize(n=n()) %>%
    ungroup() %>%
    ggplot(aes(x=fct_reorder(locNamefct, n, .fun=min), y=n)) + 
    geom_col(aes(fill=isLikely), position="fill") + 
    labs(y="", x="") + 
    coord_flip() + 
    theme(legend.position="bottom") + 
    scale_fill_discrete("Highest Probability >= 0.975 and Delta Probability >= 0.25?", 
                        breaks=c(TRUE, FALSE)
                        )

Further work is needed to explore the one vs. all predictor to see whether it can be extended with one vs. one predictors, or whether it is overly faulty and prone to making confidently inaccurate predictions.
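One way to quantify confidently inaccurate predictions is to count cases where the highest probability is large but the top prediction is wrong. This is a hypothetical sketch reusing the predStats column names with an assumed 0.9 confidence cutoff and toy values:

```r
library(dplyr)

# Toy stand-in with the same columns as predStats above
toyStats <- tibble(highProb=c(0.95, 0.60, 0.98),
                   accHigh=c(TRUE, FALSE, FALSE))

# "Confidently inaccurate": a high predicted probability on a wrong prediction
overconfident <- toyStats %>%
    filter(highProb >= 0.9, !accHigh)
```

A high rate of such cases would argue against trusting the one vs. all probabilities on their own.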

The correct locale is frequently reasonably close in probability to the “highest probability” locale. For example, plots of the actual probability and distance show:

helperAccHighPlot <- function(boolHigh, desc) {

    p1 <- predStats %>% 
        mutate(delta=round(-2*(actualLocaleProb-highProb), 1)/2, 
               rndActual=round(2*actualLocaleProb, 1)/2
               ) %>% 
        filter(accHigh==boolHigh) %>%
        count(rndActual, delta) %>% 
        ggplot(aes(x=rndActual, y=delta)) + 
        geom_tile(aes(fill=n)) + 
        scale_fill_continuous("# Obs", low="white", high="green") + 
        geom_text(aes(label=n), size=3.5) + 
        labs(x="Rounded Probability of Actual", 
             y="Delta of Highest Probability vs. Actual Probability", 
             title=paste0("Probabilities Associated with Actual Locale and Highest Probability Locale", desc)
             )
    print(p1)
    
}

helperAccHighPlot(boolHigh=TRUE, desc=" (Correct Predictions)")

helperAccHighPlot(boolHigh=FALSE, desc=" (Incorrect Predictions)")

Statistics can then be captured from the overall data to assess the probability of a prediction being correct conditional on the prediction falling in a given bucket of actual probability vs. delta probability:

# Find the probabilities associated with all predictions for each locale
allProbs <- locData %>%
    mutate(record=row_number()) %>%
    pivot_longer(-c(record, locNamefct, source, dtime)) %>%
    mutate(rndValue=round(20*value)/20, 
           correct=(name==locNamefct)
           ) %>%
    group_by(record, locNamefct, source, dtime) %>%
    mutate(maxValue=max(value)) %>%
    ungroup() %>%
    mutate(deltaValue=round(20*(maxValue-value))/20)

# Find the probability that the prediction is correct
allProbs %>%
    group_by(rndValue, deltaValue) %>%
    summarize(n=n(), acc=mean(correct)) %>%
    filter(n>=10) %>%
    ggplot(aes(x=rndValue, y=deltaValue)) +
    geom_tile(aes(fill=acc)) + 
    geom_text(aes(label=paste0(round(100*acc), "%")), size=3) +
    scale_fill_continuous("% Actual", low="white", high="green") + 
    labs(x="Rounded Probability", 
         y="Delta of Highest Probability vs. Probability", 
         title="Percent Correct Given Locale Probability and Delta from Highest Probability", 
         subtitle="Includes only buckets with 10+ observations"
         )

Suppose that as a process of exclusion, every locale that is further than 27.5% from the highest prediction and has a probability lower than 47.5% is removed from consideration. What is the frequency of data remaining for consideration, and how much correct data has been excluded?

excludeSummary <- allProbs %>%
    mutate(include=(deltaValue <= 0.275 | rndValue >= 0.475)) %>%
    group_by(locNamefct, source, dtime, record) %>%
    summarize(n=sum(include), hasCorrect=max(correct*include)) %>%
    ungroup()

excludeSummary %>% 
    summarize(n=mean(n), acc=mean(hasCorrect))
## # A tibble: 1 x 2
##       n   acc
##   <dbl> <dbl>
## 1  6.74 0.912

This approach reduced data volumes by roughly 75% (29 possibilities down to an average of about 7) while preserving about 91% of the correct predictions. How do the statistics vary by actual locale?

excludeSummary %>% 
    group_by(locNamefct) %>%
    summarize(n=mean(n), acc=mean(hasCorrect)) %>%
    ggplot(aes(x=fct_reorder(locNamefct, acc), y=acc)) + 
    geom_col(fill="lightblue") + 
    geom_text(aes(y=acc/2, label=paste0(round(100*acc), "% (avg.n is ", round(n, 1), ")"))) + 
    coord_flip() + 
    labs(x="", y="Percent Included")

For all locales, a strong majority are at least considered for the next step.

An additional approach is to train multiple one-vs-one models and to assess the number of votes received. A smaller subset of the data is used as a starting point, with two cities per archetype:

# Define a set of locales
oneoneLocs <- c("Chicago, IL", "Milwaukee, WI", 
                "Las Vegas, NV", "Phoenix, AZ", 
                "Houston, TX", "New Orleans, LA", 
                "Philadelphia, PA", "Newark, NJ",
                "San Diego, CA", "Los Angeles, CA"
                )

# Create a container for storing all relevant objects
localeOnevOne <- vector("list", choose(length(oneoneLocs), 2))

# Run the modeling process once for each locale comparison
n <- 1
for (firstLoc in oneoneLocs[1:(length(oneoneLocs)-1)]) {
    for (secondLoc in oneoneLocs[(match(firstLoc, oneoneLocs)+1):length(oneoneLocs)]) {
    
        # Announce the progress
        cat("\nProcess for:", firstLoc, "and", secondLoc, "\n")
    
        # Create the training data to use for the model
        thisTrain <- localeTrainData %>%
            filter(locNamefct %in% c(firstLoc, secondLoc)) %>%
            mutate(curLocale=factor(locNamefct, levels=c(firstLoc, secondLoc)))
    
        # Find the smallest group size
        smallN <- thisTrain %>%
            count(curLocale) %>%
            pull(n) %>%
            min()
    
        # Balance the samples in thisTrain
        thisTrain <- thisTrain %>%
            group_by(curLocale) %>%
            sample_n(smallN) %>%
            ungroup()
    
        # Run the CV process with a callback for early stopping if 5 iterations show no improvement
        xgb_thislocale_cv <- xgbRunModel_002(thisTrain, 
                                             depVar="curLocale", 
                                             predVars=locXGBPreds, 
                                             otherVars=keepVarFull, 
                                             seed=2008181350+n,
                                             nrounds=500,
                                             print_every_n=500, 
                                             xgbObjective="multi:softmax", 
                                             funcRun=xgboost::xgb.cv, 
                                             nfold=5, 
                                             num_class=2, 
                                             early_stopping_rounds=5
                                             )

        # The best iteration can then be pulled
        bestN <- xgb_thislocale_cv$xgbModel$best_iteration
    
        # And the model can be run for that number of iterations
        xgb_thislocale <- xgbRunModel_002(thisTrain, 
                                          depVar="curLocale", 
                                          predVars=locXGBPreds, 
                                          otherVars=keepVarFull, 
                                          seed=2008181350+1000+n,
                                          nrounds=bestN,
                                          print_every_n=500, 
                                          xgbObjective="multi:softprob", 
                                          funcRun=xgboost::xgboost, 
                                          num_class=2
                                          )
    
    
        # Place the trained CV object in the relevant list
        localeOnevOne[[n]] <- list(cvResult=xgb_thislocale_cv, 
                                   mdlResult=xgb_thislocale, 
                                   bestN=bestN, 
                                   firstLoc=firstLoc, 
                                   secondLoc=secondLoc
                                   )
    
        # Increment the counter
        n <- n + 1
    
    }
}
## 
## Process for: Chicago, IL and Milwaukee, WI 
## [1]  train-merror:0.415757+0.005393  test-merror:0.456690+0.010868 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [18] train-merror:0.238410+0.005699  test-merror:0.395423+0.007691
## 
## [1]  train-merror:0.423944 
## [18] train-merror:0.256455 
## 
## Process for: Chicago, IL and Las Vegas, NV 
## [1]  train-merror:0.065836+0.001132  test-merror:0.081994+0.005316 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [71] train-merror:0.000000+0.000000  test-merror:0.008915+0.001547
## 
## [1]  train-merror:0.070029 
## [71] train-merror:0.000000 
## 
## Process for: Chicago, IL and Phoenix, AZ 
## [1]  train-merror:0.071131+0.002832  test-merror:0.082618+0.003579 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [84] train-merror:0.000000+0.000000  test-merror:0.007522+0.001310
## 
## [1]  train-merror:0.065225 
## [84] train-merror:0.000000 
## 
## Process for: Chicago, IL and Houston, TX 
## [1]  train-merror:0.145383+0.002374  test-merror:0.159582+0.005846 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [63] train-merror:0.002294+0.000463  test-merror:0.026597+0.003015
## 
## [1]  train-merror:0.161324 
## [63] train-merror:0.003368 
## 
## Process for: Chicago, IL and New Orleans, LA 
## [1]  train-merror:0.130342+0.015590  test-merror:0.136635+0.016406 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [63] train-merror:0.000410+0.000234  test-merror:0.014284+0.001760
## 
## [1]  train-merror:0.144948 
## [63] train-merror:0.000351 
## 
## Process for: Chicago, IL and Philadelphia, PA 
## [1]  train-merror:0.347480+0.013769  test-merror:0.369582+0.016109 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [84] train-merror:0.029347+0.003220  test-merror:0.158539+0.008183
## 
## [1]  train-merror:0.345493 
## [84] train-merror:0.030749 
## 
## Process for: Chicago, IL and Newark, NJ 
## [1]  train-merror:0.347927+0.020626  test-merror:0.373084+0.022635 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [108]    train-merror:0.015894+0.001260  test-merror:0.155797+0.009895
## 
## [1]  train-merror:0.377271 
## [108]    train-merror:0.020610 
## 
## Process for: Chicago, IL and San Diego, CA 
## [1]  train-merror:0.128123+0.002992  test-merror:0.136073+0.008106 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [92] train-merror:0.000381+0.000150  test-merror:0.019824+0.002274
## 
## [1]  train-merror:0.119765 
## [92] train-merror:0.000704 
## 
## Process for: Chicago, IL and Los Angeles, CA 
## [1]  train-merror:0.121590+0.002700  test-merror:0.135822+0.013069 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [84] train-merror:0.002156+0.000444  test-merror:0.029764+0.004143
## 
## [1]  train-merror:0.127318 
## [84] train-merror:0.002835 
## 
## Process for: Milwaukee, WI and Las Vegas, NV 
## [1]  train-merror:0.054460+0.002218  test-merror:0.065375+0.006482 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [52] train-merror:0.000176+0.000144  test-merror:0.013615+0.002042
## 
## [1]  train-merror:0.057512 
## [52] train-merror:0.000117 
## 
## Process for: Milwaukee, WI and Phoenix, AZ 
## [1]  train-merror:0.059819+0.002802  test-merror:0.070867+0.004717 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [61] train-merror:0.000000+0.000000  test-merror:0.010342+0.001093
## 
## [1]  train-merror:0.057116 
## [61] train-merror:0.000118 
## 
## Process for: Milwaukee, WI and Houston, TX 
## [1]  train-merror:0.134507+0.010224  test-merror:0.143896+0.006685 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [41] train-merror:0.004372+0.000733  test-merror:0.022300+0.004248
## 
## [1]  train-merror:0.134155 
## [41] train-merror:0.005986 
## 
## Process for: Milwaukee, WI and New Orleans, LA 
## [1]  train-merror:0.121332+0.009653  test-merror:0.127935+0.010600 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [67] train-merror:0.000117+0.000171  test-merror:0.013497+0.002969
## 
## [1]  train-merror:0.121244 
## [67] train-merror:0.000469 
## 
## Process for: Milwaukee, WI and Philadelphia, PA 
## [1]  train-merror:0.321714+0.003897  test-merror:0.343545+0.015569 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [96] train-merror:0.019748+0.001451  test-merror:0.146009+0.008715
## 
## [1]  train-merror:0.315845 
## [96] train-merror:0.023122 
## 
## Process for: Milwaukee, WI and Newark, NJ 
## [1]  train-merror:0.333069+0.026646  test-merror:0.361502+0.030641 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [85] train-merror:0.027817+0.000710  test-merror:0.161972+0.007301
## 
## [1]  train-merror:0.322653 
## [85] train-merror:0.029812 
## 
## Process for: Milwaukee, WI and San Diego, CA 
## [1]  train-merror:0.107600+0.004034  test-merror:0.116080+0.007473 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [37] train-merror:0.008187+0.000611  test-merror:0.026995+0.003767
## 
## [1]  train-merror:0.112793 
## [37] train-merror:0.009038 
## 
## Process for: Milwaukee, WI and Los Angeles, CA 
## [1]  train-merror:0.125517+0.002219  test-merror:0.138301+0.009369 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [61] train-merror:0.006644+0.001002  test-merror:0.034015+0.005424
## 
## [1]  train-merror:0.136412 
## [61] train-merror:0.006141 
## 
## Process for: Las Vegas, NV and Phoenix, AZ 
## [1]  train-merror:0.227465+0.006257  test-merror:0.250447+0.020623 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [71] train-merror:0.020126+0.001899  test-merror:0.101304+0.005165
## 
## [1]  train-merror:0.220355 
## [71] train-merror:0.024562 
## 
## Process for: Las Vegas, NV and Houston, TX 
## [1]  train-merror:0.042610+0.001402  test-merror:0.056776+0.004400 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [40] train-merror:0.001320+0.000425  test-merror:0.015953+0.002759
## 
## [1]  train-merror:0.042229 
## [40] train-merror:0.001877 
## 
## Process for: Las Vegas, NV and New Orleans, LA 
## [1]  train-merror:0.031114+0.001296  test-merror:0.044340+0.006662 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [59] train-merror:0.000029+0.000059  test-merror:0.011613+0.001715
## 
## [1]  train-merror:0.032845 
## [59] train-merror:0.000000 
## 
## Process for: Las Vegas, NV and Philadelphia, PA 
## [1]  train-merror:0.102463+0.004815  test-merror:0.121525+0.006525 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [80] train-merror:0.000029+0.000059  test-merror:0.019707+0.001913
## 
## [1]  train-merror:0.097947 
## [80] train-merror:0.000117 
## 
## Process for: Las Vegas, NV and Newark, NJ 
## [1]  train-merror:0.094018+0.003993  test-merror:0.110614+0.004486 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [103]    train-merror:0.000000+0.000000  test-merror:0.017479+0.001510
## 
## [1]  train-merror:0.097830 
## [103]    train-merror:0.000000 
## 
## Process for: Las Vegas, NV and San Diego, CA 
## [1]  train-merror:0.058182+0.001951  test-merror:0.072023+0.005351 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [72] train-merror:0.000117+0.000110  test-merror:0.010088+0.002415
## 
## [1]  train-merror:0.058065 
## [72] train-merror:0.000117 
## 
## Process for: Las Vegas, NV and Los Angeles, CA 
## [1]  train-merror:0.076651+0.002851  test-merror:0.090351+0.010531 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [63] train-merror:0.000089+0.000073  test-merror:0.019605+0.003305
## 
## [1]  train-merror:0.075706 
## [63] train-merror:0.000118 
## 
## Process for: Phoenix, AZ and Houston, TX 
## [1]  train-merror:0.054531+0.006234  test-merror:0.066869+0.006400 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [81] train-merror:0.000000+0.000000  test-merror:0.013633+0.001839
## 
## [1]  train-merror:0.052768 
## [81] train-merror:0.000000 
## 
## Process for: Phoenix, AZ and New Orleans, LA 
## [1]  train-merror:0.046833+0.001607  test-merror:0.060289+0.007518 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [75] train-merror:0.000000+0.000000  test-merror:0.007522+0.001360
## 
## [1]  train-merror:0.045599 
## [75] train-merror:0.000000 
## 
## Process for: Phoenix, AZ and Philadelphia, PA 
## [1]  train-merror:0.087496+0.004500  test-merror:0.100835+0.004496 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [67] train-merror:0.000029+0.000059  test-merror:0.012927+0.002600
## 
## [1]  train-merror:0.108003 
## [67] train-merror:0.000235 
## 
## Process for: Phoenix, AZ and Newark, NJ 
## [1]  train-merror:0.092255+0.002013  test-merror:0.106359+0.005649 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [86] train-merror:0.000000+0.000000  test-merror:0.012105+0.003036
## 
## [1]  train-merror:0.092490 
## [86] train-merror:0.000000 
## 
## Process for: Phoenix, AZ and San Diego, CA 
## [1]  train-merror:0.086173+0.001604  test-merror:0.091080+0.005937 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [62] train-merror:0.000000+0.000000  test-merror:0.010929+0.000954
## 
## [1]  train-merror:0.089905 
## [62] train-merror:0.000118 
## 
## Process for: Phoenix, AZ and Los Angeles, CA 
## [1]  train-merror:0.091679+0.001749  test-merror:0.106885+0.011851 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [88] train-merror:0.000000+0.000000  test-merror:0.018661+0.002861
## 
## [1]  train-merror:0.099917 
## [88] train-merror:0.000000 
## 
## Process for: Houston, TX and New Orleans, LA 
## [1]  train-merror:0.357452+0.008591  test-merror:0.370798+0.009303 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [100]    train-merror:0.006791+0.000937  test-merror:0.074230+0.005073
## 
## [1]  train-merror:0.364946 
## [100]    train-merror:0.009367 
## 
## Process for: Houston, TX and Philadelphia, PA 
## [1]  train-merror:0.174589+0.020459  test-merror:0.188704+0.015664 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [39] train-merror:0.017362+0.000650  test-merror:0.056705+0.006550
## 
## [1]  train-merror:0.208348 
## [39] train-merror:0.015784 
## 
## Process for: Houston, TX and Newark, NJ 
## [1]  train-merror:0.143864+0.003893  test-merror:0.160457+0.006093 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [33] train-merror:0.016331+0.001006  test-merror:0.045877+0.004611
## 
## [1]  train-merror:0.141942 
## [33] train-merror:0.020610 
## 
## Process for: Houston, TX and San Diego, CA 
## [1]  train-merror:0.116628+0.002976  test-merror:0.127038+0.012396 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [42] train-merror:0.011496+0.000984  test-merror:0.036129+0.007715
## 
## [1]  train-merror:0.125630 
## [42] train-merror:0.011144 
## 
## Process for: Houston, TX and Los Angeles, CA 
## [1]  train-merror:0.112289+0.006424  test-merror:0.123655+0.006831 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [53] train-merror:0.008120+0.001036  test-merror:0.039092+0.006112
## 
## [1]  train-merror:0.110901 
## [53] train-merror:0.009685 
## 
## Process for: New Orleans, LA and Philadelphia, PA 
## [1]  train-merror:0.151007+0.003771  test-merror:0.166139+0.004340 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [79] train-merror:0.001054+0.000234  test-merror:0.033368+0.001661
## 
## [1]  train-merror:0.141318 
## [79] train-merror:0.001990 
## 
## Process for: New Orleans, LA and Newark, NJ 
## [1]  train-merror:0.137894+0.001531  test-merror:0.149045+0.009430 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [62] train-merror:0.002576+0.000420  test-merror:0.030208+0.003679
## 
## [1]  train-merror:0.137220 
## [62] train-merror:0.002693 
## 
## Process for: New Orleans, LA and San Diego, CA 
## [1]  train-merror:0.107654+0.003266  test-merror:0.119063+0.005953 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [66] train-merror:0.003021+0.000897  test-merror:0.026276+0.002503
## 
## [1]  train-merror:0.101232 
## [66] train-merror:0.003754 
## 
## Process for: New Orleans, LA and Los Angeles, CA 
## [1]  train-merror:0.083028+0.005390  test-merror:0.096731+0.008411 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [73] train-merror:0.002096+0.000548  test-merror:0.025629+0.001855
## 
## [1]  train-merror:0.080666 
## [73] train-merror:0.001417 
## 
## Process for: Philadelphia, PA and Newark, NJ 
## [1]  train-merror:0.407577+0.004706  test-merror:0.435173+0.011766 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [39] train-merror:0.182275+0.002804  test-merror:0.367238+0.015633
## 
## [1]  train-merror:0.407927 
## [39] train-merror:0.195019 
## 
## Process for: Philadelphia, PA and San Diego, CA 
## [1]  train-merror:0.096598+0.001537  test-merror:0.102756+0.005151 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [67] train-merror:0.003285+0.000704  test-merror:0.032022+0.003195
## 
## [1]  train-merror:0.096891 
## [67] train-merror:0.004809 
## 
## Process for: Philadelphia, PA and Los Angeles, CA 
## [1]  train-merror:0.123804+0.002261  test-merror:0.134403+0.008948 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [65] train-merror:0.005138+0.000981  test-merror:0.038502+0.003834
## 
## [1]  train-merror:0.126609 
## [65] train-merror:0.006378 
## 
## Process for: Newark, NJ and San Diego, CA 
## [1]  train-merror:0.115601+0.006532  test-merror:0.127390+0.008396 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [50] train-merror:0.006129+0.000485  test-merror:0.031320+0.004730
## 
## [1]  train-merror:0.109326 
## [50] train-merror:0.007390 
## 
## Process for: Newark, NJ and Los Angeles, CA 
## [1]  train-merror:0.118431+0.004545  test-merror:0.132988+0.008668 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [54] train-merror:0.009655+0.000424  test-merror:0.040510+0.001320
## 
## [1]  train-merror:0.124838 
## [54] train-merror:0.010866 
## 
## Process for: San Diego, CA and Los Angeles, CA 
## [1]  train-merror:0.250798+0.016797  test-merror:0.265028+0.009821 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [60] train-merror:0.054004+0.004199  test-merror:0.159561+0.006491
## 
## [1]  train-merror:0.251683 
## [60] train-merror:0.059171

Suppose then that the test data are subset to only these 10 locales, and that each one vs. one model is used to predict on that subset:

# Prepare the test data
testOneOne <- fullDataSplit$testData %>%
    select_at(vars(all_of(c(locXGBPreds, keepVarFull)))) %>%
    filter(locNamefct %in% oneoneLocs) %>%
    mutate_if(is.factor, .funs=fct_drop)

# Function to create the probabilities by locale
predictOneOneProbs <- function(x) {
    helperXGBPredict(x$mdlResult$xgbModel, 
                     dfSparse=helperMakeSparse(testOneOne, 
                                               depVar="locNamefct", 
                                               predVars=locXGBPreds
                                               ), 
                     objective="multi:softprob", 
                     probMatrix=TRUE, 
                     yLevels=x$mdlResult$yTrainLevels
                     )$predData %>%
        mutate(rownum=row_number(), predicted=as.character(predicted))
}

# Extract from list
allOneOneProbs <- map_dfr(localeOnevOne, .f=predictOneOneProbs, .id="Run")

# Make the prediction based on 1) most votes, then 2) highest probability when in majority
locOneOne <- allOneOneProbs %>%
    group_by(rownum, predicted) %>%
    summarize(n=n(), highProb=sum(probPredicted >= 0.9), majProb=sum(probPredicted)) %>%
    filter(n==max(n)) %>%
    ungroup() %>%
    arrange(rownum, -highProb, -majProb) %>%
    group_by(rownum) %>%
    filter(row_number()==1) %>%
    ungroup() %>%
    bind_cols(select(testOneOne, locNamefct, source, dtime))

# Overall accuracy by locale
locOneOne %>%
    mutate(correct=predicted==locNamefct) %>%
    group_by(locNamefct) %>%
    summarize(pctCorrect=mean(correct), n=n()) %>%
    ggplot(aes(x=fct_reorder(locNamefct, pctCorrect), y=pctCorrect)) + 
    geom_point() + 
    geom_text(aes(y=pctCorrect+0.02, label=paste0(round(100*pctCorrect), "%")), hjust=0, size=3.5) + 
    coord_flip() + 
    labs(x="Actual Locale", 
         y="Percent Correctly Predicted", 
         title="Predictions based on Maximum Probability"
         ) + 
    ylim(c(0, 1.1))

# Confusion matrix
locOneOne %>%
    mutate(name=factor(predicted)) %>%
    count(locNamefct, name) %>%
    group_by(locNamefct) %>%
    mutate(pct=n/sum(n)) %>%
    ungroup() %>%
    ggplot(aes(x=locNamefct, y=name)) + 
    geom_tile(aes(fill=pct)) + 
    geom_text(aes(label=paste0(round(100*pct), "%")), size=2.5) + 
    coord_flip() + 
    labs(x="Actual Locale", 
         y="Predicted Locale", 
         title="Predictions based on Maximum Probability"
         ) + 
    scale_fill_continuous(low="white", high="lightgreen") + 
    theme(axis.text.x=element_text(angle=90, hjust=1))
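The vote-then-probability tie-break above can be illustrated on a toy input. Here three hypothetical one vs. one models score a single test row; the column names mirror allOneOneProbs, but the values are invented for illustration:

```r
library(dplyr)

# Three one vs. one predictions for one test observation
toyProbs <- tibble(rownum=1, 
                   predicted=c("Houston, TX", "Houston, TX", "Newark, NJ"), 
                   probPredicted=c(0.95, 0.70, 0.99)
                   )

# Same aggregation as locOneOne: tally votes, keep the majority,
# then break any remaining tie by probability
toyProbs %>%
    group_by(rownum, predicted) %>%
    summarize(n=n(), highProb=sum(probPredicted >= 0.9), majProb=sum(probPredicted)) %>%
    filter(n==max(n)) %>%
    ungroup() %>%
    arrange(rownum, -highProb, -majProb) %>%
    group_by(rownum) %>%
    filter(row_number()==1) %>%
    ungroup()
## Houston, TX wins with 2 of 3 votes, despite Newark's single 0.99
```

Note that vote count dominates: a locale winning more pairwise comparisons beats one with a single very confident prediction.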

Locales are being classified to the proper archetype, which is encouraging. Accuracy also improves, though the gain is likely driven by having only 10 candidate categories rather than 30. The process is now expanded to consider all locales reporting in 2016:
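Before running the expanded comparison, it is worth noting the scale: with k locales, the one vs. one scheme fits choose(k, 2) = k(k-1)/2 binary models, so the jump from 10 to 30 locales nearly tenfolds the number of XGBoost runs:

```r
# Number of pairwise models implied by the one vs. one scheme
choose(10, 2)   # the 10-locale subset above
## [1] 45
choose(30, 2)   # all 30 locales reporting in 2016
## [1] 435
```

The expanded run itself proceeds exactly as before: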

# Define a set of locales
oneoneLocs_002 <- locs2016

# Create a container for storing all relevant objects
localeOnevOne_002 <- vector("list", choose(length(oneoneLocs_002), 2))

# Run the modeling process once for each locale comparison
n <- 1
for (firstLoc in oneoneLocs_002[1:(length(oneoneLocs_002)-1)]) {
    for (secondLoc in oneoneLocs_002[(match(firstLoc, oneoneLocs_002)+1):length(oneoneLocs_002)]) {
    
        # Announce the progress
        cat("\nProcess for:", firstLoc, "and", secondLoc, "\n")
    
        # Create the training data to use for the model
        thisTrain <- localeTrainData %>%
            filter(locNamefct %in% c(firstLoc, secondLoc)) %>%
            mutate(curLocale=factor(locNamefct, levels=c(firstLoc, secondLoc)))
    
        # Find the smallest group size
        smallN <- thisTrain %>%
            count(curLocale) %>%
            pull(n) %>%
            min()
    
        # Balance the samples in thisTrain
        thisTrain <- thisTrain %>%
            group_by(curLocale) %>%
            sample_n(smallN) %>%
            ungroup()
    
        # Run the CV process with a callback for early stopping if 5 iterations show no improvement
        xgb_thislocale_cv <- xgbRunModel_002(thisTrain, 
                                             depVar="curLocale", 
                                             predVars=locXGBPreds, 
                                             otherVars=keepVarFull, 
                                             seed=2008181350+n,
                                             nrounds=500,
                                             print_every_n=500, 
                                             xgbObjective="multi:softmax", 
                                             funcRun=xgboost::xgb.cv, 
                                             nfold=5, 
                                             num_class=2, 
                                             early_stopping_rounds=5
                                             )

        # The best iteration can then be pulled
        bestN <- xgb_thislocale_cv$xgbModel$best_iteration
    
        # And the model can be run for that number of iterations
        xgb_thislocale <- xgbRunModel_002(thisTrain, 
                                          depVar="curLocale", 
                                          predVars=locXGBPreds, 
                                          otherVars=keepVarFull, 
                                          seed=2008181350+1000+n,
                                          nrounds=bestN,
                                          print_every_n=500, 
                                          xgbObjective="multi:softprob", 
                                          funcRun=xgboost::xgboost, 
                                          num_class=2
                                          )
    
    
        # Place the trained CV object in the relevant list
        localeOnevOne_002[[n]] <- list(cvResult=xgb_thislocale_cv, 
                                       mdlResult=xgb_thislocale, 
                                       bestN=bestN, 
                                       firstLoc=firstLoc, 
                                       secondLoc=secondLoc
                                       )
    
        # Increment the counter
        n <- n + 1
    
    }
}
## 
## Process for: Atlanta, GA and Boston, MA 
## [1]  train-merror:0.236186+0.019874  test-merror:0.258851+0.013790 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [66] train-merror:0.008936+0.000812  test-merror:0.057236+0.005063
## 
## [1]  train-merror:0.238290 
## [66] train-merror:0.009228 
## 
## Process for: Atlanta, GA and Chicago, IL 
## [1]  train-merror:0.261463+0.008519  test-merror:0.280781+0.015822 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [91] train-merror:0.004274+0.001003  test-merror:0.063985+0.008151
## 
## [1]  train-merror:0.269577 
## [91] train-merror:0.004158 
## 
## Process for: Atlanta, GA and Dallas, TX 
## [1]  train-merror:0.231669+0.001546  test-merror:0.248860+0.006953 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [138]    train-merror:0.005877+0.001512  test-merror:0.126301+0.010383
## 
## [1]  train-merror:0.236347 
## [138]    train-merror:0.010642 
## 
## Process for: Atlanta, GA and Denver, CO 
## [1]  train-merror:0.107986+0.002131  test-merror:0.118994+0.002688 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [50] train-merror:0.000177+0.000236  test-merror:0.012396+0.002242
## 
## [1]  train-merror:0.113918 
## [50] train-merror:0.000472 
## 
## Process for: Atlanta, GA and Detroit, MI 
## [1]  train-merror:0.260908+0.003890  test-merror:0.270621+0.005449 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [106]    train-merror:0.000727+0.000460  test-merror:0.060850+0.002361
## 
## [1]  train-merror:0.263409 
## [106]    train-merror:0.002443 
## 
## Process for: Atlanta, GA and Grand Rapids, MI 
## [1]  train-merror:0.263931+0.002912  test-merror:0.281427+0.010262 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [49] train-merror:0.019772+0.001357  test-merror:0.072665+0.011007
## 
## [1]  train-merror:0.263435 
## [49] train-merror:0.022780 
## 
## Process for: Atlanta, GA and Green Bay, WI 
## [1]  train-merror:0.211356+0.003627  test-merror:0.224189+0.008836 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [78] train-merror:0.002256+0.000255  test-merror:0.034806+0.003327
## 
## [1]  train-merror:0.217860 
## [78] train-merror:0.003281 
## 
## Process for: Atlanta, GA and Houston, TX 
## [1]  train-merror:0.253339+0.002781  test-merror:0.266318+0.005229 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [104]    train-merror:0.001045+0.000214  test-merror:0.049361+0.004790
## 
## [1]  train-merror:0.243322 
## [104]    train-merror:0.001974 
## 
## Process for: Atlanta, GA and Indianapolis, IN 
## [1]  train-merror:0.310086+0.003728  test-merror:0.331249+0.015942 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [49] train-merror:0.045513+0.003199  test-merror:0.131967+0.005101
## 
## [1]  train-merror:0.319801 
## [49] train-merror:0.046264 
## 
## Process for: Atlanta, GA and Las Vegas, NV 
## [1]  train-merror:0.080264+0.004268  test-merror:0.096305+0.005185 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [23] train-merror:0.012434+0.000520  test-merror:0.039414+0.004120
## 
## [1]  train-merror:0.075777 
## [23] train-merror:0.012082 
## 
## Process for: Atlanta, GA and Lincoln, NE 
## [1]  train-merror:0.217256+0.003157  test-merror:0.236878+0.009395 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [62] train-merror:0.018136+0.001916  test-merror:0.079892+0.006049
## 
## [1]  train-merror:0.217868 
## [62] train-merror:0.024843 
## 
## Process for: Atlanta, GA and Los Angeles, CA 
## [1]  train-merror:0.138981+0.004510  test-merror:0.154248+0.009464 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [57] train-merror:0.006407+0.000784  test-merror:0.042045+0.006483
## 
## [1]  train-merror:0.137829 
## [57] train-merror:0.009330 
## 
## Process for: Atlanta, GA and Madison, WI 
## [1]  train-merror:0.251286+0.005369  test-merror:0.263214+0.007023 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [75] train-merror:0.004275+0.000666  test-merror:0.049509+0.005475
## 
## [1]  train-merror:0.255680 
## [75] train-merror:0.005979 
## 
## Process for: Atlanta, GA and Miami, FL 
## [1]  train-merror:0.109388+0.005293  test-merror:0.122925+0.008761 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [66] train-merror:0.000087+0.000071  test-merror:0.012780+0.002080
## 
## [1]  train-merror:0.102475 
## [66] train-merror:0.000000 
## 
## Process for: Atlanta, GA and Milwaukee, WI 
## [1]  train-merror:0.280458+0.002439  test-merror:0.301878+0.011799 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [112]    train-merror:0.001614+0.000445  test-merror:0.047300+0.004890
## 
## [1]  train-merror:0.201643 
## [112]    train-merror:0.001408 
## 
## Process for: Atlanta, GA and Minneapolis, MN 
## [1]  train-merror:0.205045+0.002670  test-merror:0.214418+0.011085 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [95] train-merror:0.001823+0.000435  test-merror:0.043392+0.007691
## 
## [1]  train-merror:0.209327 
## [95] train-merror:0.003819 
## 
## Process for: Atlanta, GA and New Orleans, LA 
## [1]  train-merror:0.244175+0.017262  test-merror:0.260040+0.015446 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [67] train-merror:0.001376+0.000545  test-merror:0.028217+0.002879
## 
## [1]  train-merror:0.235921 
## [67] train-merror:0.002459 
## 
## Process for: Atlanta, GA and Newark, NJ 
## [1]  train-merror:0.254832+0.003689  test-merror:0.271538+0.018292 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [96] train-merror:0.004454+0.000534  test-merror:0.059386+0.006467
## 
## [1]  train-merror:0.256171 
## [96] train-merror:0.007452 
## 
## Process for: Atlanta, GA and Philadelphia, PA 
## [1]  train-merror:0.275635+0.003277  test-merror:0.293697+0.008725 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [78] train-merror:0.009266+0.000749  test-merror:0.071555+0.007353
## 
## [1]  train-merror:0.293698 
## [78] train-merror:0.012276 
## 
## Process for: Atlanta, GA and Phoenix, AZ 
## [1]  train-merror:0.087495+0.003530  test-merror:0.107415+0.007607 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [77] train-merror:0.000088+0.000072  test-merror:0.025855+0.004088
## 
## [1]  train-merror:0.089200 
## [77] train-merror:0.000235 
## 
## Process for: Atlanta, GA and Saint Louis, MO 
## [1]  train-merror:0.316847+0.012021  test-merror:0.328486+0.017338 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [61] train-merror:0.059194+0.004036  test-merror:0.175443+0.004258
## 
## [1]  train-merror:0.323678 
## [61] train-merror:0.063797 
## 
## Process for: Atlanta, GA and San Antonio, TX 
## [1]  train-merror:0.229248+0.007434  test-merror:0.244421+0.014902 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [114]    train-merror:0.000000+0.000000  test-merror:0.015351+0.002921
## 
## [1]  train-merror:0.221041 
## [114]    train-merror:0.000000 
## 
## Process for: Atlanta, GA and San Diego, CA 
## [1]  train-merror:0.132493+0.005881  test-merror:0.141934+0.006025 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [77] train-merror:0.000352+0.000117  test-merror:0.022874+0.003652
## 
## [1]  train-merror:0.119883 
## [77] train-merror:0.001408 
## 
## Process for: Atlanta, GA and San Francisco, CA 
## [1]  train-merror:0.116245+0.004042  test-merror:0.130665+0.005839 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [67] train-merror:0.007534+0.001594  test-merror:0.052148+0.003701
## 
## [1]  train-merror:0.118658 
## [67] train-merror:0.010006 
## 
## Process for: Atlanta, GA and San Jose, CA 
## [1]  train-merror:0.147975+0.000792  test-merror:0.160229+0.007496 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [74] train-merror:0.009103+0.001066  test-merror:0.054733+0.004005
## 
## [1]  train-merror:0.145408 
## [74] train-merror:0.010970 
## 
## Process for: Atlanta, GA and Seattle, WA 
## [1]  train-merror:0.163292+0.008910  test-merror:0.174115+0.015599 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [75] train-merror:0.000379+0.000327  test-merror:0.023107+0.003170
## 
## [1]  train-merror:0.159762 
## [75] train-merror:0.000934 
## 
## Process for: Atlanta, GA and Tampa Bay, FL 
## [1]  train-merror:0.237346+0.019561  test-merror:0.253886+0.021428 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [87] train-merror:0.000146+0.000092  test-merror:0.025716+0.001383
## 
## [1]  train-merror:0.254705 
## [87] train-merror:0.000468 
## 
## Process for: Atlanta, GA and Traverse City, MI 
## [1]  train-merror:0.185886+0.007760  test-merror:0.207943+0.011316 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [64] train-merror:0.003763+0.000703  test-merror:0.041912+0.006852
## 
## [1]  train-merror:0.182355 
## [64] train-merror:0.004863 
## 
## Process for: Atlanta, GA and Washington, DC 
## [1]  train-merror:0.277153+0.005260  test-merror:0.298598+0.009093 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [106]    train-merror:0.005175+0.000575  test-merror:0.070426+0.003769
## 
## [1]  train-merror:0.265168 
## [106]    train-merror:0.007495 
## 
## Process for: Boston, MA and Chicago, IL 
## [1]  train-merror:0.370781+0.007569  test-merror:0.382665+0.006217 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [142]    train-merror:0.007738+0.000856  test-merror:0.123000+0.003712
## 
## [1]  train-merror:0.372503 
## [142]    train-merror:0.013083 
## 
## Process for: Boston, MA and Dallas, TX 
## [1]  train-merror:0.186265+0.003142  test-merror:0.203247+0.012819 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [49] train-merror:0.008859+0.001239  test-merror:0.046192+0.004697
## 
## [1]  train-merror:0.188516 
## [49] train-merror:0.008771 
## 
## Process for: Boston, MA and Denver, CO 
## [1]  train-merror:0.174123+0.010882  test-merror:0.196438+0.012161 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [75] train-merror:0.000620+0.000342  test-merror:0.025264+0.003577
## 
## [1]  train-merror:0.198914 
## [75] train-merror:0.002007 
## 
## Process for: Boston, MA and Detroit, MI 
## [1]  train-merror:0.296782+0.003974  test-merror:0.317604+0.012222 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [105]    train-merror:0.023683+0.001221  test-merror:0.155356+0.007618
## 
## [1]  train-merror:0.304287 
## [105]    train-merror:0.028268 
## 
## Process for: Boston, MA and Grand Rapids, MI 
## [1]  train-merror:0.307534+0.005185  test-merror:0.333178+0.007371 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [115]    train-merror:0.014136+0.001661  test-merror:0.130726+0.006336
## 
## [1]  train-merror:0.303622 
## [115]    train-merror:0.021262 
## 
## Process for: Boston, MA and Green Bay, WI 
## [1]  train-merror:0.277833+0.002778  test-merror:0.301419+0.010865 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [95] train-merror:0.015118+0.001680  test-merror:0.130903+0.011053
## 
## [1]  train-merror:0.283839 
## [95] train-merror:0.022970 
## 
## Process for: Boston, MA and Houston, TX 
## [1]  train-merror:0.126095+0.003369  test-merror:0.139356+0.014875 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [45] train-merror:0.002044+0.000206  test-merror:0.020558+0.003004
## 
## [1]  train-merror:0.126153 
## [45] train-merror:0.004205 
## 
## Process for: Boston, MA and Indianapolis, IN 
## [1]  train-merror:0.312463+0.005940  test-merror:0.336994+0.012251 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [122]    train-merror:0.007855+0.000817  test-merror:0.112019+0.008747
## 
## [1]  train-merror:0.303352 
## [122]    train-merror:0.010046 
## 
## Process for: Boston, MA and Las Vegas, NV 
## [1]  train-merror:0.076276+0.004284  test-merror:0.091379+0.007229 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [79] train-merror:0.000029+0.000059  test-merror:0.013139+0.001916
## 
## [1]  train-merror:0.076950 
## [79] train-merror:0.000000 
## 
## Process for: Boston, MA and Lincoln, NE 
## [1]  train-merror:0.242846+0.004032  test-merror:0.271464+0.006852 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [108]    train-merror:0.005607+0.000644  test-merror:0.086672+0.007724
## 
## [1]  train-merror:0.249153 
## [108]    train-merror:0.007943 
## 
## Process for: Boston, MA and Los Angeles, CA 
## [1]  train-merror:0.123775+0.003820  test-merror:0.127434+0.005913 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [36] train-merror:0.014999+0.001075  test-merror:0.037085+0.003901
## 
## [1]  train-merror:0.132160 
## [36] train-merror:0.015590 
## 
## Process for: Boston, MA and Madison, WI 
## [1]  train-merror:0.269074+0.006032  test-merror:0.295024+0.009382 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [102]    train-merror:0.016653+0.002118  test-merror:0.126762+0.010452
## 
## [1]  train-merror:0.282229 
## [102]    train-merror:0.020450 
## 
## Process for: Boston, MA and Miami, FL 
## [1]  train-merror:0.046636+0.012552  test-merror:0.053263+0.016635 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [26] train-merror:0.001460+0.000333  test-merror:0.007592+0.001476
## 
## [1]  train-merror:0.052447 
## [26] train-merror:0.001635 
## 
## Process for: Boston, MA and Milwaukee, WI 
## [1]  train-merror:0.348591+0.013981  test-merror:0.375120+0.017186 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [185]    train-merror:0.002406+0.000626  test-merror:0.128522+0.007387
## 
## [1]  train-merror:0.337559 
## [185]    train-merror:0.005047 
## 
## Process for: Boston, MA and Minneapolis, MN 
## [1]  train-merror:0.297891+0.004120  test-merror:0.322043+0.012914 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [106]    train-merror:0.014572+0.001265  test-merror:0.114238+0.005792
## 
## [1]  train-merror:0.298446 
## [106]    train-merror:0.022194 
## 
## Process for: Boston, MA and New Orleans, LA 
## [1]  train-merror:0.109881+0.005154  test-merror:0.120713+0.007403 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [61] train-merror:0.000117+0.000171  test-merror:0.013698+0.002417
## 
## [1]  train-merror:0.111462 
## [61] train-merror:0.000702 
## 
## Process for: Boston, MA and Newark, NJ 
## [1]  train-merror:0.350309+0.002937  test-merror:0.374138+0.009638 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [46] train-merror:0.096601+0.003636  test-merror:0.253358+0.005760
## 
## [1]  train-merror:0.353347 
## [46] train-merror:0.119846 
## 
## Process for: Boston, MA and Philadelphia, PA 
## [1]  train-merror:0.341459+0.006083  test-merror:0.374608+0.015786 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [25] train-merror:0.136736+0.003403  test-merror:0.233020+0.009048
## 
## [1]  train-merror:0.342687 
## [25] train-merror:0.150240 
## 
## Process for: Boston, MA and Phoenix, AZ 
## [1]  train-merror:0.074979+0.005461  test-merror:0.087436+0.007013 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [61] train-merror:0.000059+0.000118  test-merror:0.011400+0.002186
## 
## [1]  train-merror:0.070396 
## [61] train-merror:0.000235 
## 
## Process for: Boston, MA and Saint Louis, MO 
## [1]  train-merror:0.287909+0.009029  test-merror:0.307729+0.015049 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [96] train-merror:0.007828+0.000897  test-merror:0.083734+0.008483
## 
## [1]  train-merror:0.289082 
## [96] train-merror:0.011493 
## 
## Process for: Boston, MA and San Antonio, TX 
## [1]  train-merror:0.157072+0.006930  test-merror:0.179597+0.010978 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [79] train-merror:0.000030+0.000059  test-merror:0.004723+0.001056
## 
## [1]  train-merror:0.136734 
## [79] train-merror:0.000000 
## 
## Process for: Boston, MA and San Diego, CA 
## [1]  train-merror:0.101085+0.001410  test-merror:0.114371+0.006698 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [39] train-merror:0.005572+0.000764  test-merror:0.023695+0.002637
## 
## [1]  train-merror:0.106158 
## [39] train-merror:0.007742 
## 
## Process for: Boston, MA and San Francisco, CA 
## [1]  train-merror:0.122131+0.001754  test-merror:0.128311+0.005670 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [43] train-merror:0.007240+0.000923  test-merror:0.032019+0.006368
## 
## [1]  train-merror:0.142201 
## [43] train-merror:0.009182 
## 
## Process for: Boston, MA and San Jose, CA 
## [1]  train-merror:0.133103+0.005330  test-merror:0.144610+0.006132 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [73] train-merror:0.005723+0.001230  test-merror:0.039832+0.003721
## 
## [1]  train-merror:0.136082 
## [73] train-merror:0.007359 
## 
## Process for: Boston, MA and Seattle, WA 
## [1]  train-merror:0.177286+0.003223  test-merror:0.192383+0.017509 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [124]    train-merror:0.000000+0.000000  test-merror:0.013083+0.002495
## 
## [1]  train-merror:0.170774 
## [124]    train-merror:0.000000 
## 
## Process for: Boston, MA and Tampa Bay, FL 
## [1]  train-merror:0.061310+0.006347  test-merror:0.068964+0.005461 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [54] train-merror:0.000672+0.000117  test-merror:0.011689+0.001693
## 
## [1]  train-merror:0.074342 
## [54] train-merror:0.001286 
## 
## Process for: Boston, MA and Traverse City, MI 
## [1]  train-merror:0.210344+0.002369  test-merror:0.238641+0.002936 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [87] train-merror:0.019478+0.000341  test-merror:0.119495+0.007129
## 
## [1]  train-merror:0.213526 
## [87] train-merror:0.025114 
## 
## Process for: Boston, MA and Washington, DC 
## [1]  train-merror:0.285005+0.002330  test-merror:0.318817+0.010009 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [47] train-merror:0.066500+0.002881  test-merror:0.164524+0.009378
## 
## [1]  train-merror:0.290388 
## [47] train-merror:0.073281 
## 
## Process for: Chicago, IL and Dallas, TX 
## [1]  train-merror:0.217987+0.014725  test-merror:0.234240+0.013263 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [101]    train-merror:0.001900+0.000570  test-merror:0.050285+0.004181
## 
## [1]  train-merror:0.241960 
## [101]    train-merror:0.003041 
## 
## Process for: Chicago, IL and Denver, CO 
## [1]  train-merror:0.167956+0.006351  test-merror:0.182151+0.009266 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [69] train-merror:0.001151+0.000480  test-merror:0.030575+0.003572
## 
## [1]  train-merror:0.166686 
## [69] train-merror:0.001653 
## 
## Process for: Chicago, IL and Detroit, MI 
## [1]  train-merror:0.374637+0.003375  test-merror:0.411516+0.015771 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [80] train-merror:0.073560+0.004584  test-merror:0.284702+0.012423
## 
## [1]  train-merror:0.388249 
## [80] train-merror:0.096451 
## 
## Process for: Chicago, IL and Grand Rapids, MI 
## [1]  train-merror:0.390479+0.004770  test-merror:0.418227+0.012767 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [63] train-merror:0.105111+0.004244  test-merror:0.318690+0.008544
## 
## [1]  train-merror:0.394743 
## [63] train-merror:0.122313 
## 
## Process for: Chicago, IL and Green Bay, WI 
## [1]  train-merror:0.338070+0.003979  test-merror:0.375836+0.010000 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [45] train-merror:0.097094+0.008670  test-merror:0.266961+0.013577
## 
## [1]  train-merror:0.336341 
## [45] train-merror:0.116723 
## 
## Process for: Chicago, IL and Houston, TX 
## [1]  train-merror:0.155256+0.007648  test-merror:0.164344+0.010492 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [60] train-merror:0.003397+0.000782  test-merror:0.027642+0.002763
## 
## [1]  train-merror:0.166783 
## [60] train-merror:0.004530 
## 
## Process for: Chicago, IL and Indianapolis, IN 
## [1]  train-merror:0.393216+0.004694  test-merror:0.414989+0.010750 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [47] train-merror:0.114880+0.004937  test-merror:0.288920+0.012024
## 
## [1]  train-merror:0.399491 
## [47] train-merror:0.141337 
## 
## Process for: Chicago, IL and Las Vegas, NV 
## [1]  train-merror:0.068563+0.003319  test-merror:0.081409+0.007607 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [118]    train-merror:0.000000+0.000000  test-merror:0.009971+0.001172
## 
## [1]  train-merror:0.064633 
## [118]    train-merror:0.000000 
## 
## Process for: Chicago, IL and Lincoln, NE 
## [1]  train-merror:0.290850+0.003048  test-merror:0.310706+0.008702 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [115]    train-merror:0.011284+0.000237  test-merror:0.136458+0.011170
## 
## [1]  train-merror:0.286564 
## [115]    train-merror:0.015162 
## 
## Process for: Chicago, IL and Los Angeles, CA 
## [1]  train-merror:0.130477+0.002841  test-merror:0.137358+0.002716 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [53] train-merror:0.008474+0.000969  test-merror:0.034605+0.001098
## 
## [1]  train-merror:0.128853 
## [53] train-merror:0.010984 
## 
## Process for: Chicago, IL and Madison, WI 
## [1]  train-merror:0.344684+0.006323  test-merror:0.371082+0.005917 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [24] train-merror:0.192179+0.003215  test-merror:0.309734+0.005818
## 
## [1]  train-merror:0.363430 
## [24] train-merror:0.208323 
## 
## Process for: Chicago, IL and Miami, FL 
## [1]  train-merror:0.051499+0.005489  test-merror:0.054373+0.006546 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [46] train-merror:0.000494+0.000174  test-merror:0.009760+0.001782
## 
## [1]  train-merror:0.071453 
## [46] train-merror:0.000465 
## 
## Process for: Chicago, IL and Milwaukee, WI 
## [1]  train-merror:0.418016+0.007356  test-merror:0.446247+0.012613 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [26] train-merror:0.211884+0.005703  test-merror:0.378758+0.010832
## 
## [1]  train-merror:0.427817 
## [26] train-merror:0.217723 
## 
## Process for: Chicago, IL and Minneapolis, MN 
## [1]  train-merror:0.377111+0.010390  test-merror:0.400256+0.009671 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [57] train-merror:0.084760+0.004793  test-merror:0.233162+0.007908
## 
## [1]  train-merror:0.376880 
## [57] train-merror:0.098357 
## 
## Process for: Chicago, IL and New Orleans, LA 
## [1]  train-merror:0.153960+0.028105  test-merror:0.163928+0.030879 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [82] train-merror:0.000088+0.000072  test-merror:0.012645+0.001362
## 
## [1]  train-merror:0.144362 
## [82] train-merror:0.000117 
## 
## Process for: Chicago, IL and Newark, NJ 
## [1]  train-merror:0.367984+0.019810  test-merror:0.390314+0.019629 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [112]    train-merror:0.014235+0.001358  test-merror:0.152653+0.008111
## 
## [1]  train-merror:0.360619 
## [112]    train-merror:0.020494 
## 
## Process for: Chicago, IL and Philadelphia, PA 
## [1]  train-merror:0.347568+0.011086  test-merror:0.371098+0.003657 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [26] train-merror:0.116421+0.001638  test-merror:0.206827+0.010089
## 
## [1]  train-merror:0.358588 
## [26] train-merror:0.129194 
## 
## Process for: Chicago, IL and Phoenix, AZ 
## [1]  train-merror:0.073070+0.001655  test-merror:0.084499+0.007643 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [70] train-merror:0.000000+0.000000  test-merror:0.010342+0.001469
## 
## [1]  train-merror:0.064755 
## [70] train-merror:0.000000 
## 
## Process for: Chicago, IL and Saint Louis, MO 
## [1]  train-merror:0.342852+0.003338  test-merror:0.359563+0.011793 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [117]    train-merror:0.007242+0.000837  test-merror:0.123607+0.005465
## 
## [1]  train-merror:0.343145 
## [117]    train-merror:0.009968 
## 
## Process for: Chicago, IL and San Antonio, TX 
## [1]  train-merror:0.193146+0.007217  test-merror:0.203925+0.010282 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [101]    train-merror:0.000000+0.000000  test-merror:0.004605+0.001557
## 
## [1]  train-merror:0.192349 
## [101]    train-merror:0.000000 
## 
## Process for: Chicago, IL and San Diego, CA 
## [1]  train-merror:0.128563+0.002595  test-merror:0.137244+0.006350 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [78] train-merror:0.000851+0.000376  test-merror:0.021583+0.004104
## 
## [1]  train-merror:0.122463 
## [78] train-merror:0.001056 
## 
## Process for: Chicago, IL and San Francisco, CA 
## [1]  train-merror:0.116068+0.008805  test-merror:0.128783+0.009081 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [44] train-merror:0.010447+0.001762  test-merror:0.033902+0.003081
## 
## [1]  train-merror:0.123484 
## [44] train-merror:0.011654 
## 
## Process for: Chicago, IL and San Jose, CA 
## [1]  train-merror:0.141382+0.002162  test-merror:0.147743+0.012016 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [39] train-merror:0.018643+0.001921  test-merror:0.045512+0.006839
## 
## [1]  train-merror:0.136305 
## [39] train-merror:0.022523 
## 
## Process for: Chicago, IL and Seattle, WA 
## [1]  train-merror:0.181264+0.004248  test-merror:0.195939+0.008048 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [105]    train-merror:0.000175+0.000058  test-merror:0.033259+0.006238
## 
## [1]  train-merror:0.197923 
## [105]    train-merror:0.000233 
## 
## Process for: Chicago, IL and Tampa Bay, FL 
## [1]  train-merror:0.111192+0.008452  test-merror:0.119696+0.011677 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [43] train-merror:0.002396+0.000695  test-merror:0.016948+0.003926
## 
## [1]  train-merror:0.113384 
## [43] train-merror:0.002922 
## 
## Process for: Chicago, IL and Traverse City, MI 
## [1]  train-merror:0.288150+0.005690  test-merror:0.315388+0.009781 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [82] train-merror:0.040784+0.002004  test-merror:0.193934+0.004959
## 
## [1]  train-merror:0.291305 
## [82] train-merror:0.051754 
## 
## Process for: Chicago, IL and Washington, DC 
## [1]  train-merror:0.332976+0.003966  test-merror:0.347490+0.005332 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [49] train-merror:0.056537+0.004938  test-merror:0.146084+0.004382
## 
## [1]  train-merror:0.332263 
## [49] train-merror:0.063883 
## 
## Process for: Dallas, TX and Denver, CO 
## [1]  train-merror:0.107986+0.003798  test-merror:0.119467+0.002937 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [29] train-merror:0.001210+0.000442  test-merror:0.009444+0.001791
## 
## [1]  train-merror:0.107779 
## [29] train-merror:0.002951 
## 
## Process for: Dallas, TX and Detroit, MI 
## [1]  train-merror:0.202491+0.013372  test-merror:0.219504+0.020260 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [115]    train-merror:0.000702+0.000194  test-merror:0.051221+0.006314
## 
## [1]  train-merror:0.210853 
## [115]    train-merror:0.000936 
## 
## Process for: Dallas, TX and Grand Rapids, MI 
## [1]  train-merror:0.183896+0.008433  test-merror:0.199040+0.010508 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [64] train-merror:0.007075+0.000689  test-merror:0.046544+0.002731
## 
## [1]  train-merror:0.234125 
## [64] train-merror:0.009356 
## 
## Process for: Dallas, TX and Green Bay, WI 
## [1]  train-merror:0.170837+0.028262  test-merror:0.185862+0.034066 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [84] train-merror:0.000527+0.000354  test-merror:0.031173+0.003767
## 
## [1]  train-merror:0.164889 
## [84] train-merror:0.000703 
## 
## Process for: Dallas, TX and Houston, TX 
## [1]  train-merror:0.263360+0.006792  test-merror:0.287216+0.007596 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [112]    train-merror:0.000877+0.000160  test-merror:0.050403+0.004534
## 
## [1]  train-merror:0.277511 
## [112]    train-merror:0.001988 
## 
## Process for: Dallas, TX and Indianapolis, IN 
## [1]  train-merror:0.278242+0.008002  test-merror:0.291896+0.009107 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [72] train-merror:0.016899+0.001527  test-merror:0.096830+0.005919
## 
## [1]  train-merror:0.279149 
## [72] train-merror:0.019530 
## 
## Process for: Dallas, TX and Las Vegas, NV 
## [1]  train-merror:0.071584+0.002750  test-merror:0.083872+0.005670 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [53] train-merror:0.004868+0.000604  test-merror:0.036363+0.003702
## 
## [1]  train-merror:0.075660 
## [53] train-merror:0.004927 
## 
## Process for: Dallas, TX and Lincoln, NE 
## [1]  train-merror:0.244562+0.012180  test-merror:0.259267+0.005092 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [103]    train-merror:0.005818+0.000339  test-merror:0.081861+0.003628
## 
## [1]  train-merror:0.247339 
## [103]    train-merror:0.007835 
## 
## Process for: Dallas, TX and Los Angeles, CA 
## [1]  train-merror:0.110015+0.002228  test-merror:0.119878+0.009289 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [93] train-merror:0.001004+0.000461  test-merror:0.033188+0.002164
## 
## [1]  train-merror:0.112555 
## [93] train-merror:0.001299 
## 
## Process for: Dallas, TX and Madison, WI 
## [1]  train-merror:0.201716+0.007065  test-merror:0.216575+0.006577 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [56] train-merror:0.009627+0.000895  test-merror:0.051902+0.005866
## 
## [1]  train-merror:0.238938 
## [56] train-merror:0.010165 
## 
## Process for: Dallas, TX and Miami, FL 
## [1]  train-merror:0.117910+0.006164  test-merror:0.125950+0.007819 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [83] train-merror:0.000000+0.000000  test-merror:0.009823+0.000860
## 
## [1]  train-merror:0.125249 
## [83] train-merror:0.000000 
## 
## Process for: Dallas, TX and Milwaukee, WI 
## [1]  train-merror:0.181162+0.011172  test-merror:0.194014+0.011070 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [60] train-merror:0.008774+0.000252  test-merror:0.053873+0.004156
## 
## [1]  train-merror:0.176643 
## [60] train-merror:0.008803 
## 
## Process for: Dallas, TX and Minneapolis, MN 
## [1]  train-merror:0.226699+0.005090  test-merror:0.240558+0.007562 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [70] train-merror:0.004970+0.000722  test-merror:0.043856+0.003598
## 
## [1]  train-merror:0.237399 
## [70] train-merror:0.007134 
## 
## Process for: Dallas, TX and New Orleans, LA 
## [1]  train-merror:0.243179+0.011326  test-merror:0.264728+0.014863 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [90] train-merror:0.000176+0.000171  test-merror:0.016743+0.002635
## 
## [1]  train-merror:0.253015 
## [90] train-merror:0.000117 
## 
## Process for: Dallas, TX and Newark, NJ 
## [1]  train-merror:0.240118+0.008488  test-merror:0.257164+0.012911 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [121]    train-merror:0.000556+0.000298  test-merror:0.041398+0.004382
## 
## [1]  train-merror:0.253538 
## [121]    train-merror:0.000819 
## 
## Process for: Dallas, TX and Philadelphia, PA 
## [1]  train-merror:0.247047+0.003779  test-merror:0.269912+0.010766 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [84] train-merror:0.003947+0.000901  test-merror:0.057186+0.003575
## 
## [1]  train-merror:0.245234 
## [84] train-merror:0.005146 
## 
## Process for: Dallas, TX and Phoenix, AZ 
## [1]  train-merror:0.090904+0.003996  test-merror:0.112587+0.003678 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [95] train-merror:0.000059+0.000072  test-merror:0.028088+0.002450
## 
## [1]  train-merror:0.097779 
## [95] train-merror:0.000118 
## 
## Process for: Dallas, TX and Saint Louis, MO 
## [1]  train-merror:0.316202+0.014311  test-merror:0.327547+0.007972 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [56] train-merror:0.051366+0.003708  test-merror:0.154804+0.008015
## 
## [1]  train-merror:0.313006 
## [56] train-merror:0.059106 
## 
## Process for: Dallas, TX and San Antonio, TX 
## [1]  train-merror:0.295076+0.011637  test-merror:0.321294+0.016596 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [90] train-merror:0.000177+0.000172  test-merror:0.026806+0.006775
## 
## [1]  train-merror:0.308537 
## [90] train-merror:0.000354 
## 
## Process for: Dallas, TX and San Diego, CA 
## [1]  train-merror:0.136950+0.010934  test-merror:0.150619+0.014981 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [83] train-merror:0.000411+0.000171  test-merror:0.021701+0.001861
## 
## [1]  train-merror:0.130440 
## [83] train-merror:0.000469 
## 
## Process for: Dallas, TX and San Francisco, CA 
## [1]  train-merror:0.128723+0.010070  test-merror:0.138317+0.009092 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [88] train-merror:0.001148+0.000235  test-merror:0.033313+0.006616
## 
## [1]  train-merror:0.121483 
## [88] train-merror:0.001177 
## 
## Process for: Dallas, TX and San Jose, CA 
## [1]  train-merror:0.100310+0.002041  test-merror:0.109811+0.007232 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [92] train-merror:0.000819+0.000271  test-merror:0.031807+0.004452
## 
## [1]  train-merror:0.104081 
## [92] train-merror:0.001286 
## 
## Process for: Dallas, TX and Seattle, WA 
## [1]  train-merror:0.140598+0.003758  test-merror:0.156824+0.011428 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [92] train-merror:0.000000+0.000000  test-merror:0.011928+0.001203
## 
## [1]  train-merror:0.147819 
## [92] train-merror:0.000000 
## 
## Process for: Dallas, TX and Tampa Bay, FL 
## [1]  train-merror:0.178751+0.005905  test-merror:0.192841+0.009359 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [123]    train-merror:0.000000+0.000000  test-merror:0.018945+0.004415
## 
## [1]  train-merror:0.180447 
## [123]    train-merror:0.000000 
## 
## Process for: Dallas, TX and Traverse City, MI 
## [1]  train-merror:0.170389+0.003778  test-merror:0.184070+0.010478 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [51] train-merror:0.005233+0.000700  test-merror:0.040813+0.006605
## 
## [1]  train-merror:0.171559 
## [51] train-merror:0.007134 
## 
## Process for: Dallas, TX and Washington, DC 
## [1]  train-merror:0.249524+0.004229  test-merror:0.271950+0.012884 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [80] train-merror:0.004401+0.000485  test-merror:0.061743+0.005710
## 
## [1]  train-merror:0.271592 
## [80] train-merror:0.004759 
## 
## Process for: Denver, CO and Detroit, MI 
## [1]  train-merror:0.168781+0.010759  test-merror:0.182862+0.011481 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [68] train-merror:0.003217+0.000539  test-merror:0.030222+0.004805
## 
## [1]  train-merror:0.180970 
## [68] train-merror:0.003423 
## 
## Process for: Denver, CO and Grand Rapids, MI 
## [1]  train-merror:0.149776+0.003786  test-merror:0.168574+0.006614 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [41] train-merror:0.008765+0.001699  test-merror:0.039310+0.004312
## 
## [1]  train-merror:0.149687 
## [41] train-merror:0.010388 
## 
## Process for: Denver, CO and Green Bay, WI 
## [1]  train-merror:0.149333+0.004944  test-merror:0.162908+0.010048 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [97] train-merror:0.000000+0.000000  test-merror:0.024790+0.004906
## 
## [1]  train-merror:0.147444 
## [97] train-merror:0.000472 
## 
## Process for: Denver, CO and Houston, TX 
## [1]  train-merror:0.030811+0.001688  test-merror:0.035060+0.004391 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [34] train-merror:0.000000+0.000000  test-merror:0.002951+0.001829
## 
## [1]  train-merror:0.035887 
## [34] train-merror:0.000000 
## 
## Process for: Denver, CO and Indianapolis, IN 
## [1]  train-merror:0.167867+0.006950  test-merror:0.180144+0.005740 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [40] train-merror:0.007998+0.000708  test-merror:0.030575+0.004262
## 
## [1]  train-merror:0.159249 
## [40] train-merror:0.010861 
## 
## Process for: Denver, CO and Las Vegas, NV 
## [1]  train-merror:0.092610+0.005027  test-merror:0.107189+0.002948 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [78] train-merror:0.000000+0.000000  test-merror:0.013103+0.004277
## 
## [1]  train-merror:0.083461 
## [78] train-merror:0.000000 
## 
## Process for: Denver, CO and Lincoln, NE 
## [1]  train-merror:0.181236+0.004667  test-merror:0.193484+0.006348 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [57] train-merror:0.002066+0.000447  test-merror:0.031639+0.004654
## 
## [1]  train-merror:0.180026 
## [57] train-merror:0.003305 
## 
## Process for: Denver, CO and Los Angeles, CA 
## [1]  train-merror:0.043315+0.004702  test-merror:0.050550+0.010249 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [33] train-merror:0.000000+0.000000  test-merror:0.002835+0.001513
## 
## [1]  train-merror:0.038030 
## [33] train-merror:0.000000 
## 
## Process for: Denver, CO and Madison, WI 
## [1]  train-merror:0.180519+0.005151  test-merror:0.193137+0.009339 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [46] train-merror:0.008371+0.001107  test-merror:0.036117+0.003338
## 
## [1]  train-merror:0.182731 
## [46] train-merror:0.010404 
## 
## Process for: Denver, CO and Miami, FL 
## [1]  train-merror:0.006994+0.001054  test-merror:0.010270+0.002257 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [19] train-merror:0.000000+0.000000  test-merror:0.001063+0.001016
## 
## [1]  train-merror:0.008382 
## [19] train-merror:0.000118 
## 
## Process for: Denver, CO and Milwaukee, WI 
## [1]  train-merror:0.176042+0.003374  test-merror:0.190651+0.010411 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [79] train-merror:0.000561+0.000172  test-merror:0.030221+0.004688
## 
## [1]  train-merror:0.165860 
## [79] train-merror:0.000944 
## 
## Process for: Denver, CO and Minneapolis, MN 
## [1]  train-merror:0.196523+0.018289  test-merror:0.212374+0.018550 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [67] train-merror:0.001712+0.000602  test-merror:0.030930+0.004923
## 
## [1]  train-merror:0.216976 
## [67] train-merror:0.001771 
## 
## Process for: Denver, CO and New Orleans, LA 
## [1]  train-merror:0.030752+0.001391  test-merror:0.034588+0.005742 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [34] train-merror:0.000000+0.000000  test-merror:0.003069+0.000867
## 
## [1]  train-merror:0.032936 
## [34] train-merror:0.000000 
## 
## Process for: Denver, CO and Newark, NJ 
## [1]  train-merror:0.165299+0.002346  test-merror:0.178966+0.010531 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [53] train-merror:0.005578+0.000411  test-merror:0.032228+0.006569
## 
## [1]  train-merror:0.166214 
## [53] train-merror:0.006965 
## 
## Process for: Denver, CO and Philadelphia, PA 
## [1]  train-merror:0.185751+0.006956  test-merror:0.207179+0.010176 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [67] train-merror:0.001535+0.000274  test-merror:0.028097+0.003563
## 
## [1]  train-merror:0.201393 
## [67] train-merror:0.004250 
## 
## Process for: Denver, CO and Phoenix, AZ 
## [1]  train-merror:0.050053+0.009413  test-merror:0.063272+0.013810 
## 
## Stopping. Best iteration:
## [33] train-merror:0.000030+0.000059  test-merror:0.006847+0.001093
## 
## [1]  train-merror:0.035533 
## [33] train-merror:0.000708 
## 
## Process for: Denver, CO and Saint Louis, MO 
## [1]  train-merror:0.147916+0.002517  test-merror:0.163263+0.005203 
## 
## Stopping. Best iteration:
## [68] train-merror:0.001328+0.000417  test-merror:0.024318+0.002515
## 
## [1]  train-merror:0.159485 
## [68] train-merror:0.002597 
## 
## Process for: Denver, CO and San Antonio, TX 
## [1]  train-merror:0.065740+0.005107  test-merror:0.078285+0.005682 
## 
## Stopping. Best iteration:
## [37] train-merror:0.000000+0.000000  test-merror:0.004014+0.001881
## 
## [1]  train-merror:0.064707 
## [37] train-merror:0.000000 
## 
## Process for: Denver, CO and San Diego, CA 
## [1]  train-merror:0.027683+0.002250  test-merror:0.035296+0.006402 
## 
## Stopping. Best iteration:
## [37] train-merror:0.000000+0.000000  test-merror:0.002243+0.000441
## 
## [1]  train-merror:0.029749 
## [37] train-merror:0.000000 
## 
## Process for: Denver, CO and San Francisco, CA 
## [1]  train-merror:0.017707+0.002008  test-merror:0.022076+0.004820 
## 
## Stopping. Best iteration:
## [35] train-merror:0.000000+0.000000  test-merror:0.003188+0.000289
## 
## [1]  train-merror:0.024790 
## [35] train-merror:0.000000 
## 
## Process for: Denver, CO and San Jose, CA 
## [1]  train-merror:0.034352+0.005235  test-merror:0.043677+0.009988 
## 
## Stopping. Best iteration:
## [46] train-merror:0.000000+0.000000  test-merror:0.002833+0.001144
## 
## [1]  train-merror:0.026207 
## [46] train-merror:0.000000 
## 
## Process for: Denver, CO and Seattle, WA 
## [1]  train-merror:0.035149+0.002351  test-merror:0.044623+0.003390 
## 
## Stopping. Best iteration:
## [57] train-merror:0.000000+0.000000  test-merror:0.006846+0.001424
## 
## [1]  train-merror:0.036005 
## [57] train-merror:0.000000 
## 
## Process for: Denver, CO and Tampa Bay, FL 
## [1]  train-merror:0.021160+0.003127  test-merror:0.026561+0.002841 
## 
## Stopping. Best iteration:
## [39] train-merror:0.000000+0.000000  test-merror:0.001535+0.001695
## 
## [1]  train-merror:0.023492 
## [39] train-merror:0.000000 
## 
## Process for: Denver, CO and Traverse City, MI 
## [1]  train-merror:0.144699+0.001916  test-merror:0.163974+0.006560 
## 
## Stopping. Best iteration:
## [66] train-merror:0.002509+0.000264  test-merror:0.031402+0.003695
## 
## [1]  train-merror:0.138590 
## [66] train-merror:0.002243 
## 
## Process for: Denver, CO and Washington, DC 
## [1]  train-merror:0.162265+0.004697  test-merror:0.181062+0.009697 
## 
## Stopping. Best iteration:
## [53] train-merror:0.002647+0.000854  test-merror:0.022960+0.002783
## 
## [1]  train-merror:0.168927 
## [53] train-merror:0.003926 
## 
## Process for: Detroit, MI and Grand Rapids, MI 
## [1]  train-merror:0.399445+0.005661  test-merror:0.430139+0.012498 
## 
## Stopping. Best iteration:
## [39] train-merror:0.157652+0.008571  test-merror:0.337147+0.010418
## 
## [1]  train-merror:0.397313 
## [39] train-merror:0.164252 
## 
## Process for: Detroit, MI and Green Bay, WI 
## [1]  train-merror:0.339857+0.003300  test-merror:0.368218+0.010287 
## 
## Stopping. Best iteration:
## [62] train-merror:0.062873+0.004337  test-merror:0.229344+0.011766
## 
## [1]  train-merror:0.345131 
## [62] train-merror:0.084495 
## 
## Process for: Detroit, MI and Houston, TX 
## [1]  train-merror:0.156923+0.006876  test-merror:0.170332+0.011229 
## 
## Stopping. Best iteration:
## [44] train-merror:0.007650+0.001226  test-merror:0.032344+0.003690
## 
## [1]  train-merror:0.168005 
## [44] train-merror:0.007330 
## 
## Process for: Detroit, MI and Indianapolis, IN 
## [1]  train-merror:0.363351+0.003140  test-merror:0.395580+0.008169 
## 
## Stopping. Best iteration:
## [66] train-merror:0.075160+0.006867  test-merror:0.252005+0.013993
## 
## [1]  train-merror:0.371030 
## [66] train-merror:0.087958 
## 
## Process for: Detroit, MI and Las Vegas, NV 
## [1]  train-merror:0.079120+0.001852  test-merror:0.093373+0.008240 
## 
## Stopping. Best iteration:
## [49] train-merror:0.000264+0.000171  test-merror:0.014545+0.002810
## 
## [1]  train-merror:0.081760 
## [49] train-merror:0.000352 
## 
## Process for: Detroit, MI and Lincoln, NE 
## [1]  train-merror:0.258893+0.004627  test-merror:0.279449+0.003852 
## 
## Stopping. Best iteration:
## [124]    train-merror:0.005948+0.000955  test-merror:0.117332+0.006163
## 
## [1]  train-merror:0.250525 
## [124]    train-merror:0.009214 
## 
## Process for: Detroit, MI and Los Angeles, CA 
## [1]  train-merror:0.136885+0.008069  test-merror:0.141136+0.006727 
## 
## Stopping. Best iteration:
## [77] train-merror:0.001476+0.000520  test-merror:0.025748+0.004573
## 
## [1]  train-merror:0.122475 
## [77] train-merror:0.002480 
## 
## Process for: Detroit, MI and Madison, WI 
## [1]  train-merror:0.355477+0.004528  test-merror:0.379813+0.008670 
## 
## Stopping. Best iteration:
## [70] train-merror:0.074474+0.008077  test-merror:0.256636+0.015660
## 
## [1]  train-merror:0.354102 
## [70] train-merror:0.086702 
## 
## Process for: Detroit, MI and Miami, FL 
## [1]  train-merror:0.060006+0.001005  test-merror:0.067714+0.006568 
## 
## Stopping. Best iteration:
## [35] train-merror:0.001571+0.000214  test-merror:0.011868+0.003246
## 
## [1]  train-merror:0.064340 
## [35] train-merror:0.002443 
## 
## Process for: Detroit, MI and Milwaukee, WI 
## [1]  train-merror:0.383627+0.012153  test-merror:0.422535+0.018885 
## 
## Stopping. Best iteration:
## [76] train-merror:0.074501+0.005483  test-merror:0.275939+0.006369
## 
## [1]  train-merror:0.393662 
## [76] train-merror:0.103169 
## 
## Process for: Detroit, MI and Minneapolis, MN 
## [1]  train-merror:0.349418+0.004213  test-merror:0.373822+0.010306 
## 
## Stopping. Best iteration:
## [44] train-merror:0.103199+0.005281  test-merror:0.217452+0.009311
## 
## [1]  train-merror:0.360558 
## [44] train-merror:0.111926 
## 
## Process for: Detroit, MI and New Orleans, LA 
## [1]  train-merror:0.139153+0.016189  test-merror:0.146115+0.015097 
## 
## Stopping. Best iteration:
## [66] train-merror:0.000615+0.000386  test-merror:0.018616+0.002199
## 
## [1]  train-merror:0.126917 
## [66] train-merror:0.000820 
## 
## Process for: Detroit, MI and Newark, NJ 
## [1]  train-merror:0.333751+0.009998  test-merror:0.350607+0.018939 
## 
## Stopping. Best iteration:
## [81] train-merror:0.038105+0.003112  test-merror:0.196088+0.007107
## 
## [1]  train-merror:0.348160 
## [81] train-merror:0.045762 
## 
## Process for: Detroit, MI and Philadelphia, PA 
## [1]  train-merror:0.341224+0.015618  test-merror:0.365721+0.009883 
## 
## Stopping. Best iteration:
## [91] train-merror:0.028586+0.003028  test-merror:0.183562+0.001618
## 
## [1]  train-merror:0.339179 
## [91] train-merror:0.038349 
## 
## Process for: Detroit, MI and Phoenix, AZ 
## [1]  train-merror:0.075420+0.004018  test-merror:0.088143+0.004541 
## 
## Stopping. Best iteration:
## [43] train-merror:0.000617+0.000110  test-merror:0.012105+0.001987
## 
## [1]  train-merror:0.064637 
## [43] train-merror:0.000353 
## 
## Process for: Detroit, MI and Saint Louis, MO 
## [1]  train-merror:0.325760+0.010096  test-merror:0.353582+0.013495 
## 
## Stopping. Best iteration:
## [149]    train-merror:0.002463+0.000987  test-merror:0.115867+0.006383
## 
## [1]  train-merror:0.328486 
## [149]    train-merror:0.002463 
## 
## Process for: Detroit, MI and San Antonio, TX 
## [1]  train-merror:0.151022+0.014155  test-merror:0.170030+0.024606 
## 
## Stopping. Best iteration:
## [73] train-merror:0.000000+0.000000  test-merror:0.006730+0.002548
## 
## [1]  train-merror:0.153501 
## [73] train-merror:0.000000 
## 
## Process for: Detroit, MI and San Diego, CA 
## [1]  train-merror:0.097360+0.003115  test-merror:0.108270+0.009020 
## 
## Stopping. Best iteration:
## [51] train-merror:0.003285+0.000545  test-merror:0.020176+0.003332
## 
## [1]  train-merror:0.098768 
## [51] train-merror:0.004106 
## 
## Process for: Detroit, MI and San Francisco, CA 
## [1]  train-merror:0.125251+0.004625  test-merror:0.133019+0.013259 
## 
## Stopping. Best iteration:
## [56] train-merror:0.005327+0.000328  test-merror:0.032137+0.001698
## 
## [1]  train-merror:0.126310 
## [56] train-merror:0.007769 
## 
## Process for: Detroit, MI and San Jose, CA 
## [1]  train-merror:0.137676+0.005300  test-merror:0.148792+0.007135 
## 
## Stopping. Best iteration:
## [50] train-merror:0.014617+0.001457  test-merror:0.049013+0.007900
## 
## [1]  train-merror:0.137822 
## [50] train-merror:0.018205 
## 
## Process for: Detroit, MI and Seattle, WA 
## [1]  train-merror:0.197777+0.004062  test-merror:0.208656+0.017484 
## 
## Stopping. Best iteration:
## [123]    train-merror:0.000146+0.000185  test-merror:0.032442+0.007345
## 
## [1]  train-merror:0.202941 
## [123]    train-merror:0.000000 
## 
## Process for: Detroit, MI and Tampa Bay, FL 
## [1]  train-merror:0.088135+0.003329  test-merror:0.094564+0.005384 
## 
## Stopping. Best iteration:
## [38] train-merror:0.005231+0.000824  test-merror:0.018586+0.002946
## 
## [1]  train-merror:0.090240 
## [38] train-merror:0.005961 
## 
## Process for: Detroit, MI and Traverse City, MI 
## [1]  train-merror:0.333333+0.006342  test-merror:0.353811+0.010126 
## 
## Stopping. Best iteration:
## [43] train-merror:0.109249+0.006951  test-merror:0.242234+0.010174
## 
## [1]  train-merror:0.345084 
## [43] train-merror:0.122513 
## 
## Process for: Detroit, MI and Washington, DC 
## [1]  train-merror:0.328753+0.012910  test-merror:0.344636+0.011487 
## 
## Stopping. Best iteration:
## [62] train-merror:0.041667+0.001716  test-merror:0.152509+0.011638
## 
## [1]  train-merror:0.318582 
## [62] train-merror:0.049013 
## 
## Process for: Grand Rapids, MI and Green Bay, WI 
## [1]  train-merror:0.339535+0.003482  test-merror:0.365522+0.005618 
## 
## Stopping. Best iteration:
## [122]    train-merror:0.025724+0.002143  test-merror:0.232978+0.009689
## 
## [1]  train-merror:0.336693 
## [122]    train-merror:0.036095 
## 
## Process for: Grand Rapids, MI and Houston, TX 
## [1]  train-merror:0.141822+0.009727  test-merror:0.154790+0.008068 
## 
## Stopping. Best iteration:
## [68] train-merror:0.001577+0.000534  test-merror:0.023949+0.002638
## 
## [1]  train-merror:0.161799 
## [68] train-merror:0.002570 
## 
## Process for: Grand Rapids, MI and Indianapolis, IN 
## [1]  train-merror:0.358031+0.007471  test-merror:0.388435+0.010764 
## 
## Stopping. Best iteration:
## [63] train-merror:0.069363+0.002821  test-merror:0.231542+0.016598
## 
## [1]  train-merror:0.358294 
## [63] train-merror:0.073598 
## 
## Process for: Grand Rapids, MI and Las Vegas, NV 
## [1]  train-merror:0.065748+0.002298  test-merror:0.079765+0.006166 
## 
## Stopping. Best iteration:
## [60] train-merror:0.000088+0.000072  test-merror:0.010792+0.001421
## 
## [1]  train-merror:0.069208 
## [60] train-merror:0.000000 
## 
## Process for: Grand Rapids, MI and Lincoln, NE 
## [1]  train-merror:0.266181+0.007568  test-merror:0.290305+0.011059 
## 
## Stopping. Best iteration:
## [130]    train-merror:0.007594+0.001681  test-merror:0.129790+0.004600
## 
## [1]  train-merror:0.266122 
## [130]    train-merror:0.007944 
## 
## Process for: Grand Rapids, MI and Los Angeles, CA 
## [1]  train-merror:0.150378+0.003243  test-merror:0.159680+0.007570 
## 
## Stopping. Best iteration:
## [54] train-merror:0.009655+0.000616  test-merror:0.034368+0.004149
## 
## [1]  train-merror:0.147514 
## [54] train-merror:0.010630 
## 
## Process for: Grand Rapids, MI and Madison, WI 
## [1]  train-merror:0.365702+0.003756  test-merror:0.386505+0.025197 
## 
## Stopping. Best iteration:
## [48] train-merror:0.121143+0.010394  test-merror:0.290719+0.006485
## 
## [1]  train-merror:0.373834 
## [48] train-merror:0.133222 
## 
## Process for: Grand Rapids, MI and Miami, FL 
## [1]  train-merror:0.037120+0.000607  test-merror:0.038668+0.000940 
## 
## Stopping. Best iteration:
## [36] train-merror:0.000759+0.000298  test-merror:0.008060+0.002068
## 
## [1]  train-merror:0.035748 
## [36] train-merror:0.001051 
## 
## Process for: Grand Rapids, MI and Milwaukee, WI 
## [1]  train-merror:0.402700+0.007814  test-merror:0.419719+0.007086 
## 
## Stopping. Best iteration:
## [79] train-merror:0.083832+0.006330  test-merror:0.310914+0.007989
## 
## [1]  train-merror:0.411385 
## [79] train-merror:0.104812 
## 
## Process for: Grand Rapids, MI and Minneapolis, MN 
## [1]  train-merror:0.374738+0.006782  test-merror:0.398483+0.011499 
## 
## Stopping. Best iteration:
## [91] train-merror:0.052015+0.003400  test-merror:0.230605+0.008535
## 
## [1]  train-merror:0.358061 
## [91] train-merror:0.065187 
## 
## Process for: Grand Rapids, MI and New Orleans, LA 
## [1]  train-merror:0.112370+0.005429  test-merror:0.117786+0.005191 
## 
## Stopping. Best iteration:
## [50] train-merror:0.001259+0.000388  test-merror:0.016860+0.002205
## 
## [1]  train-merror:0.118019 
## [50] train-merror:0.001171 
## 
## Process for: Grand Rapids, MI and Newark, NJ 
## [1]  train-merror:0.330900+0.012575  test-merror:0.352217+0.009279 
## 
## Stopping. Best iteration:
## [71] train-merror:0.033616+0.002233  test-merror:0.159348+0.010300
## 
## [1]  train-merror:0.317290 
## [71] train-merror:0.042407 
## 
## Process for: Grand Rapids, MI and Philadelphia, PA 
## [1]  train-merror:0.331258+0.008639  test-merror:0.350401+0.015466 
## 
## Stopping. Best iteration:
## [85] train-merror:0.025371+0.002042  test-merror:0.148720+0.008317
## 
## [1]  train-merror:0.345142 
## [85] train-merror:0.031217 
## 
## Process for: Grand Rapids, MI and Phoenix, AZ 
## [1]  train-merror:0.078446+0.006503  test-merror:0.083207+0.010047 
## 
## Stopping. Best iteration:
## [48] train-merror:0.000118+0.000110  test-merror:0.009989+0.001532
## 
## [1]  train-merror:0.063697 
## [48] train-merror:0.000000 
## 
## Process for: Grand Rapids, MI and Saint Louis, MO 
## [1]  train-merror:0.326199+0.012463  test-merror:0.348778+0.020815 
## 
## Stopping. Best iteration:
## [149]    train-merror:0.001525+0.000220  test-merror:0.109298+0.004480
## 
## [1]  train-merror:0.322153 
## [149]    train-merror:0.002228 
## 
## Process for: Grand Rapids, MI and San Antonio, TX 
## [1]  train-merror:0.156158+0.008303  test-merror:0.161881+0.011273 
## 
## Stopping. Best iteration:
## [64] train-merror:0.000059+0.000073  test-merror:0.009092+0.001521
## 
## [1]  train-merror:0.196836 
## [64] train-merror:0.000118 
## 
## Process for: Grand Rapids, MI and San Diego, CA 
## [1]  train-merror:0.102669+0.005395  test-merror:0.114134+0.009285 
## 
## Stopping. Best iteration:
## [48] train-merror:0.005601+0.000925  test-merror:0.022874+0.001856
## 
## [1]  train-merror:0.096188 
## [48] train-merror:0.006334 
## 
## Process for: Grand Rapids, MI and San Francisco, CA 
## [1]  train-merror:0.129311+0.007524  test-merror:0.133961+0.006990 
## 
## Stopping. Best iteration:
## [47] train-merror:0.008358+0.001204  test-merror:0.034726+0.006195
## 
## [1]  train-merror:0.130312 
## [47] train-merror:0.010830 
## 
## Process for: Grand Rapids, MI and San Jose, CA 
## [1]  train-merror:0.148774+0.006363  test-merror:0.159110+0.008890 
## 
## Stopping. Best iteration:
## [72] train-merror:0.006717+0.001552  test-merror:0.041239+0.005479
## 
## [1]  train-merror:0.156659 
## [72] train-merror:0.007009 
## 
## Process for: Grand Rapids, MI and Seattle, WA 
## [1]  train-merror:0.213405+0.001218  test-merror:0.229442+0.009256 
## 
## Stopping. Best iteration:
## [116]    train-merror:0.000146+0.000092  test-merror:0.037033+0.004478
## 
## [1]  train-merror:0.200935 
## [116]    train-merror:0.000701 
## 
## Process for: Grand Rapids, MI and Tampa Bay, FL 
## [1]  train-merror:0.078434+0.007339  test-merror:0.082408+0.004015 
## 
## Stopping. Best iteration:
## [41] train-merror:0.002104+0.000353  test-merror:0.014144+0.001451
## 
## [1]  train-merror:0.088954 
## [41] train-merror:0.002688 
## 
## Process for: Grand Rapids, MI and Traverse City, MI 
## [1]  train-merror:0.305753+0.006067  test-merror:0.333530+0.011058 
## 
## Stopping. Best iteration:
## [43] train-merror:0.114895+0.003078  test-merror:0.252458+0.016959
## 
## [1]  train-merror:0.301051 
## [43] train-merror:0.127687 
## 
## Process for: Grand Rapids, MI and Washington, DC 
## [1]  train-merror:0.277629+0.006442  test-merror:0.296099+0.018309 
## 
## Stopping. Best iteration:
## [86] train-merror:0.019153+0.000318  test-merror:0.109922+0.006145
## 
## [1]  train-merror:0.281109 
## [86] train-merror:0.026648 
## 
## Process for: Green Bay, WI and Houston, TX 
## [1]  train-merror:0.111274+0.005654  test-merror:0.120004+0.008853 
## 
## Stopping. Best iteration:
## [49] train-merror:0.002695+0.000199  test-merror:0.022618+0.005220
## 
## [1]  train-merror:0.117778 
## [49] train-merror:0.003281 
## 
## Process for: Green Bay, WI and Indianapolis, IN 
## [1]  train-merror:0.340238+0.010265  test-merror:0.362589+0.014677 
## 
## Stopping. Best iteration:
## [164]    train-merror:0.001289+0.000326  test-merror:0.140631+0.006711
## 
## [1]  train-merror:0.332826 
## [164]    train-merror:0.003281 
## 
## Process for: Green Bay, WI and Las Vegas, NV 
## [1]  train-merror:0.043812+0.003178  test-merror:0.054781+0.005521 
## 
## Stopping. Best iteration:
## [57] train-merror:0.000000+0.000000  test-merror:0.009268+0.002838
## 
## [1]  train-merror:0.051496 
## [57] train-merror:0.000000 
## 
## Process for: Green Bay, WI and Lincoln, NE 
## [1]  train-merror:0.313166+0.004794  test-merror:0.327432+0.009535 
## 
## Stopping. Best iteration:
## [145]    train-merror:0.002666+0.000841  test-merror:0.119181+0.006889
## 
## [1]  train-merror:0.318294 
## [145]    train-merror:0.005508 
## 
## Process for: Green Bay, WI and Los Angeles, CA 
## [1]  train-merror:0.125132+0.006374  test-merror:0.131217+0.006789 
## 
## Stopping. Best iteration:
## [39] train-merror:0.009271+0.000931  test-merror:0.027401+0.003416
## 
## [1]  train-merror:0.129680 
## [39] train-merror:0.010039 
## 
## Process for: Green Bay, WI and Madison, WI 
## [1]  train-merror:0.363310+0.006002  test-merror:0.390817+0.009574 
## 
## Stopping. Best iteration:
## [39] train-merror:0.131069+0.003047  test-merror:0.289165+0.003681
## 
## [1]  train-merror:0.359244 
## [39] train-merror:0.148529 
## 
## Process for: Green Bay, WI and Miami, FL 
## [1]  train-merror:0.038673+0.002295  test-merror:0.042893+0.002914 
## 
## Stopping. Best iteration:
## [43] train-merror:0.000117+0.000144  test-merror:0.007500+0.001633
## 
## [1]  train-merror:0.037384 
## [43] train-merror:0.000586 
## 
## Process for: Green Bay, WI and Milwaukee, WI 
## [1]  train-merror:0.347388+0.005717  test-merror:0.389908+0.008927 
## 
## Stopping. Best iteration:
## [70] train-merror:0.076878+0.003359  test-merror:0.277815+0.007097
## 
## [1]  train-merror:0.344014 
## [70] train-merror:0.090493 
## 
## Process for: Green Bay, WI and Minneapolis, MN 
## [1]  train-merror:0.351372+0.009089  test-merror:0.375955+0.013468 
## 
## Stopping. Best iteration:
## [96] train-merror:0.030968+0.002236  test-merror:0.207667+0.008010
## 
## [1]  train-merror:0.358139 
## [96] train-merror:0.037853 
## 
## Process for: Green Bay, WI and New Orleans, LA 
## [1]  train-merror:0.120151+0.004340  test-merror:0.126450+0.015101 
## 
## Stopping. Best iteration:
## [38] train-merror:0.002490+0.000278  test-merror:0.014649+0.002158
## 
## [1]  train-merror:0.124575 
## [38] train-merror:0.002813 
## 
## Process for: Green Bay, WI and Newark, NJ 
## [1]  train-merror:0.250469+0.004185  test-merror:0.268607+0.009600 
## 
## Stopping. Best iteration:
## [111]    train-merror:0.004775+0.000896  test-merror:0.098910+0.005374
## 
## [1]  train-merror:0.262159 
## [111]    train-merror:0.005508 
## 
## Process for: Green Bay, WI and Philadelphia, PA 
## [1]  train-merror:0.248828+0.003344  test-merror:0.272356+0.005347 
## 
## Stopping. Best iteration:
## [87] train-merror:0.012305+0.000823  test-merror:0.104184+0.006329
## 
## [1]  train-merror:0.260635 
## [87] train-merror:0.013711 
## 
## Process for: Green Bay, WI and Phoenix, AZ 
## [1]  train-merror:0.048360+0.000817  test-merror:0.054413+0.003911 
## 
## Stopping. Best iteration:
## [64] train-merror:0.000000+0.000000  test-merror:0.005054+0.001469
## 
## [1]  train-merror:0.043601 
## [64] train-merror:0.000000 
## 
## Process for: Green Bay, WI and Saint Louis, MO 
## [1]  train-merror:0.254105+0.003297  test-merror:0.276767+0.007213 
## 
## Stopping. Best iteration:
## [76] train-merror:0.008180+0.000826  test-merror:0.089011+0.007456
## 
## [1]  train-merror:0.261053 
## [76] train-merror:0.012548 
## 
## Process for: Green Bay, WI and San Antonio, TX 
## [1]  train-merror:0.138446+0.005777  test-merror:0.147951+0.009159 
## 
## Stopping. Best iteration:
## [42] train-merror:0.000354+0.000150  test-merror:0.007912+0.002198
## 
## [1]  train-merror:0.146298 
## [42] train-merror:0.000590 
## 
## Process for: Green Bay, WI and San Diego, CA 
## [1]  train-merror:0.085865+0.003751  test-merror:0.091845+0.005078 
## 
## Stopping. Best iteration:
## [63] train-merror:0.000704+0.000387  test-merror:0.015133+0.002557
## 
## [1]  train-merror:0.090088 
## [63] train-merror:0.000469 
## 
## Process for: Green Bay, WI and San Francisco, CA 
## [1]  train-merror:0.120188+0.007668  test-merror:0.126899+0.006176 
## 
## Stopping. Best iteration:
## [80] train-merror:0.001030+0.000208  test-merror:0.024484+0.005462
## 
## [1]  train-merror:0.131725 
## [80] train-merror:0.001059 
## 
## Process for: Green Bay, WI and San Jose, CA 
## [1]  train-merror:0.112123+0.005984  test-merror:0.122348+0.005693 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [96] train-merror:0.000644+0.000199  test-merror:0.027070+0.002873
## 
## [1]  train-merror:0.109223 
## [96] train-merror:0.001055 
## 
## Process for: Green Bay, WI and Seattle, WA 
## [1]  train-merror:0.158385+0.004253  test-merror:0.170165+0.011760 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [91] train-merror:0.003076+0.000382  test-merror:0.054610+0.003653
## 
## [1]  train-merror:0.158209 
## [91] train-merror:0.004336 
## 
## Process for: Green Bay, WI and Tampa Bay, FL 
## [1]  train-merror:0.083822+0.010343  test-merror:0.087542+0.005167 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [37] train-merror:0.003662+0.000540  test-merror:0.014766+0.002998
## 
## [1]  train-merror:0.087074 
## [37] train-merror:0.004336 
## 
## Process for: Green Bay, WI and Traverse City, MI 
## [1]  train-merror:0.306252+0.006467  test-merror:0.332592+0.012091 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [48] train-merror:0.075853+0.002026  test-merror:0.202273+0.005068
## 
## [1]  train-merror:0.310207 
## [48] train-merror:0.086136 
## 
## Process for: Green Bay, WI and Washington, DC 
## [1]  train-merror:0.273644+0.002510  test-merror:0.291100+0.010395 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [99] train-merror:0.008000+0.000875  test-merror:0.089222+0.005815
## 
## [1]  train-merror:0.294789 
## [99] train-merror:0.011896 
## 
## Process for: Houston, TX and Indianapolis, IN 
## [1]  train-merror:0.155691+0.012125  test-merror:0.167828+0.018419 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [49] train-merror:0.008798+0.000987  test-merror:0.043554+0.002544
## 
## [1]  train-merror:0.150174 
## [49] train-merror:0.013937 
## 
## Process for: Houston, TX and Las Vegas, NV 
## [1]  train-merror:0.037420+0.001813  test-merror:0.049500+0.004662 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [60] train-merror:0.000059+0.000072  test-merror:0.014076+0.001851
## 
## [1]  train-merror:0.042698 
## [60] train-merror:0.000117 
## 
## Process for: Houston, TX and Lincoln, NE 
## [1]  train-merror:0.132319+0.008311  test-merror:0.146371+0.011532 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [60] train-merror:0.001895+0.000391  test-merror:0.024960+0.004631
## 
## [1]  train-merror:0.144273 
## [60] train-merror:0.002916 
## 
## Process for: Houston, TX and Los Angeles, CA 
## [1]  train-merror:0.109366+0.004230  test-merror:0.122476+0.008020 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [61] train-merror:0.006082+0.000286  test-merror:0.037439+0.001610
## 
## [1]  train-merror:0.109956 
## [61] train-merror:0.007677 
## 
## Process for: Houston, TX and Madison, WI 
## [1]  train-merror:0.148559+0.015941  test-merror:0.154628+0.012812 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [34] train-merror:0.009776+0.001479  test-merror:0.029299+0.005645
## 
## [1]  train-merror:0.128917 
## [34] train-merror:0.010643 
## 
## Process for: Houston, TX and Miami, FL 
## [1]  train-merror:0.216133+0.007978  test-merror:0.235969+0.006042 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [80] train-merror:0.027681+0.001957  test-merror:0.108633+0.006741
## 
## [1]  train-merror:0.209597 
## [80] train-merror:0.032183 
## 
## Process for: Houston, TX and Milwaukee, WI 
## [1]  train-merror:0.133392+0.014233  test-merror:0.142723+0.015957 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [52] train-merror:0.002670+0.000495  test-merror:0.022535+0.002717
## 
## [1]  train-merror:0.138615 
## [52] train-merror:0.003286 
## 
## Process for: Houston, TX and Minneapolis, MN 
## [1]  train-merror:0.120267+0.005896  test-merror:0.132866+0.017606 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [40] train-merror:0.003456+0.000749  test-merror:0.016957+0.001003
## 
## [1]  train-merror:0.131707 
## [40] train-merror:0.003600 
## 
## Process for: Houston, TX and New Orleans, LA 
## [1]  train-merror:0.359327+0.015479  test-merror:0.363186+0.011896 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [85] train-merror:0.010772+0.001686  test-merror:0.077394+0.006806
## 
## [1]  train-merror:0.349374 
## [85] train-merror:0.011006 
## 
## Process for: Houston, TX and Newark, NJ 
## [1]  train-merror:0.150385+0.005301  test-merror:0.163019+0.006164 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [36] train-merror:0.013944+0.001030  test-merror:0.046227+0.003217
## 
## [1]  train-merror:0.142175 
## [36] train-merror:0.017466 
## 
## Process for: Houston, TX and Philadelphia, PA 
## [1]  train-merror:0.199053+0.010965  test-merror:0.213376+0.008521 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [45] train-merror:0.013037+0.000719  test-merror:0.051327+0.004783
## 
## [1]  train-merror:0.154799 
## [45] train-merror:0.016485 
## 
## Process for: Houston, TX and Phoenix, AZ 
## [1]  train-merror:0.051945+0.001301  test-merror:0.064049+0.003579 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [73] train-merror:0.000000+0.000000  test-merror:0.015043+0.002121
## 
## [1]  train-merror:0.061817 
## [73] train-merror:0.000000 
## 
## Process for: Houston, TX and Saint Louis, MO 
## [1]  train-merror:0.180632+0.020620  test-merror:0.195145+0.015070 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [75] train-merror:0.000938+0.000410  test-merror:0.026973+0.001660
## 
## [1]  train-merror:0.180016 
## [75] train-merror:0.000704 
## 
## Process for: Houston, TX and San Antonio, TX 
## [1]  train-merror:0.227093+0.002070  test-merror:0.235447+0.007773 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [77] train-merror:0.000000+0.000000  test-merror:0.005431+0.002402
## 
## [1]  train-merror:0.236982 
## [77] train-merror:0.000000 
## 
## Process for: Houston, TX and San Diego, CA 
## [1]  train-merror:0.123314+0.004544  test-merror:0.135015+0.006009 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [52] train-merror:0.008035+0.000978  test-merror:0.033785+0.004477
## 
## [1]  train-merror:0.135484 
## [52] train-merror:0.009032 
## 
## Process for: Houston, TX and San Francisco, CA 
## [1]  train-merror:0.110800+0.002615  test-merror:0.119720+0.007466 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [61] train-merror:0.007799+0.000336  test-merror:0.043321+0.005537
## 
## [1]  train-merror:0.105474 
## [61] train-merror:0.009417 
## 
## Process for: Houston, TX and San Jose, CA 
## [1]  train-merror:0.099953+0.001314  test-merror:0.110864+0.007755 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [49] train-merror:0.014004+0.001687  test-merror:0.046213+0.003198
## 
## [1]  train-merror:0.091493 
## [49] train-merror:0.016805 
## 
## Process for: Houston, TX and Seattle, WA 
## [1]  train-merror:0.102929+0.003889  test-merror:0.110164+0.005353 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [75] train-merror:0.000175+0.000143  test-merror:0.015638+0.002872
## 
## [1]  train-merror:0.111332 
## [75] train-merror:0.000233 
## 
## Process for: Houston, TX and Tampa Bay, FL 
## [1]  train-merror:0.278024+0.005783  test-merror:0.291876+0.011540 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [57] train-merror:0.058971+0.002679  test-merror:0.157452+0.009902
## 
## [1]  train-merror:0.280771 
## [57] train-merror:0.067563 
## 
## Process for: Houston, TX and Traverse City, MI 
## [1]  train-merror:0.109437+0.008076  test-merror:0.117188+0.007800 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [29] train-merror:0.005546+0.000652  test-merror:0.020325+0.004299
## 
## [1]  train-merror:0.118583 
## [29] train-merror:0.005923 
## 
## Process for: Houston, TX and Washington, DC 
## [1]  train-merror:0.197597+0.004558  test-merror:0.212467+0.008716 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [65] train-merror:0.015882+0.001172  test-merror:0.071496+0.002349
## 
## [1]  train-merror:0.192719 
## [65] train-merror:0.019510 
## 
## Process for: Indianapolis, IN and Las Vegas, NV 
## [1]  train-merror:0.066745+0.003355  test-merror:0.078474+0.007477 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [53] train-merror:0.000118+0.000059  test-merror:0.019003+0.002479
## 
## [1]  train-merror:0.069443 
## [53] train-merror:0.000352 
## 
## Process for: Indianapolis, IN and Lincoln, NE 
## [1]  train-merror:0.286681+0.002637  test-merror:0.307440+0.017400 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [79] train-merror:0.038517+0.003541  test-merror:0.175417+0.019248
## 
## [1]  train-merror:0.281549 
## [79] train-merror:0.043387 
## 
## Process for: Indianapolis, IN and Los Angeles, CA 
## [1]  train-merror:0.132337+0.004923  test-merror:0.144087+0.009502 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [60] train-merror:0.005226+0.001399  test-merror:0.032007+0.003312
## 
## [1]  train-merror:0.133459 
## [60] train-merror:0.005905 
## 
## Process for: Indianapolis, IN and Madison, WI 
## [1]  train-merror:0.321514+0.003624  test-merror:0.354100+0.011784 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [32] train-merror:0.116150+0.002867  test-merror:0.221836+0.013185
## 
## [1]  train-merror:0.327792 
## [32] train-merror:0.123894 
## 
## Process for: Indianapolis, IN and Miami, FL 
## [1]  train-merror:0.086615+0.002512  test-merror:0.099220+0.007612 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [69] train-merror:0.000000+0.000000  test-merror:0.011153+0.001884
## 
## [1]  train-merror:0.083885 
## [69] train-merror:0.000000 
## 
## Process for: Indianapolis, IN and Milwaukee, WI 
## [1]  train-merror:0.388791+0.011443  test-merror:0.410798+0.015380 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [20] train-merror:0.173093+0.009917  test-merror:0.257394+0.005002
## 
## [1]  train-merror:0.369484 
## [20] train-merror:0.184038 
## 
## Process for: Indianapolis, IN and Minneapolis, MN 
## [1]  train-merror:0.357613+0.007581  test-merror:0.375490+0.010113 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [59] train-merror:0.060923+0.002313  test-merror:0.169172+0.004885
## 
## [1]  train-merror:0.361490 
## [59] train-merror:0.064568 
## 
## Process for: Indianapolis, IN and New Orleans, LA 
## [1]  train-merror:0.176267+0.021576  test-merror:0.190142+0.023796 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [68] train-merror:0.000351+0.000149  test-merror:0.020839+0.006467
## 
## [1]  train-merror:0.142723 
## [68] train-merror:0.000702 
## 
## Process for: Indianapolis, IN and Newark, NJ 
## [1]  train-merror:0.370954+0.007306  test-merror:0.381344+0.015576 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [101]    train-merror:0.014148+0.002646  test-merror:0.138683+0.014157
## 
## [1]  train-merror:0.369702 
## [101]    train-merror:0.018631 
## 
## Process for: Indianapolis, IN and Philadelphia, PA 
## [1]  train-merror:0.331696+0.005457  test-merror:0.354965+0.010607 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [102]    train-merror:0.013153+0.001538  test-merror:0.131883+0.011289
## 
## [1]  train-merror:0.344207 
## [102]    train-merror:0.013095 
## 
## Process for: Indianapolis, IN and Phoenix, AZ 
## [1]  train-merror:0.076889+0.006430  test-merror:0.093316+0.007722 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [75] train-merror:0.000000+0.000000  test-merror:0.010930+0.001092
## 
## [1]  train-merror:0.081678 
## [75] train-merror:0.000000 
## 
## Process for: Indianapolis, IN and Saint Louis, MO 
## [1]  train-merror:0.374604+0.010113  test-merror:0.410807+0.015161 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [149]    train-merror:0.008795+0.001352  test-merror:0.191741+0.008450
## 
## [1]  train-merror:0.383957 
## [149]    train-merror:0.013252 
## 
## Process for: Indianapolis, IN and San Antonio, TX 
## [1]  train-merror:0.204629+0.013450  test-merror:0.217855+0.020185 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [97] train-merror:0.000000+0.000000  test-merror:0.006849+0.001777
## 
## [1]  train-merror:0.210060 
## [97] train-merror:0.000000 
## 
## Process for: Indianapolis, IN and San Diego, CA 
## [1]  train-merror:0.125161+0.015049  test-merror:0.136422+0.015252 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [63] train-merror:0.001877+0.000589  test-merror:0.023578+0.003661
## 
## [1]  train-merror:0.122346 
## [63] train-merror:0.002229 
## 
## Process for: Indianapolis, IN and San Francisco, CA 
## [1]  train-merror:0.120689+0.003801  test-merror:0.127134+0.007853 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [62] train-merror:0.005268+0.000792  test-merror:0.041672+0.005335
## 
## [1]  train-merror:0.118305 
## [62] train-merror:0.007298 
## 
## Process for: Indianapolis, IN and San Jose, CA 
## [1]  train-merror:0.140156+0.001082  test-merror:0.145874+0.007613 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [61] train-merror:0.006623+0.000935  test-merror:0.041195+0.001544
## 
## [1]  train-merror:0.140040 
## [61] train-merror:0.008402 
## 
## Process for: Indianapolis, IN and Seattle, WA 
## [1]  train-merror:0.188207+0.003392  test-merror:0.200024+0.010921 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [104]    train-merror:0.000175+0.000143  test-merror:0.030341+0.004620
## 
## [1]  train-merror:0.188820 
## [104]    train-merror:0.000350 
## 
## Process for: Indianapolis, IN and Tampa Bay, FL 
## [1]  train-merror:0.128112+0.005324  test-merror:0.138047+0.006624 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [41] train-merror:0.005377+0.000730  test-merror:0.026534+0.005513
## 
## [1]  train-merror:0.133840 
## [41] train-merror:0.006546 
## 
## Process for: Indianapolis, IN and Traverse City, MI 
## [1]  train-merror:0.271650+0.015169  test-merror:0.295128+0.015836 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [112]    train-merror:0.010854+0.001327  test-merror:0.123421+0.008222
## 
## [1]  train-merror:0.264791 
## [112]    train-merror:0.015283 
## 
## Process for: Indianapolis, IN and Washington, DC 
## [1]  train-merror:0.311861+0.011918  test-merror:0.329762+0.012334 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [88] train-merror:0.019153+0.001850  test-merror:0.129670+0.005600
## 
## [1]  train-merror:0.315251 
## [88] train-merror:0.022008 
## 
## Process for: Las Vegas, NV and Lincoln, NE 
## [1]  train-merror:0.075190+0.001764  test-merror:0.091378+0.003038 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [58] train-merror:0.000411+0.000110  test-merror:0.018887+0.003160
## 
## [1]  train-merror:0.080235 
## [58] train-merror:0.000704 
## 
## Process for: Las Vegas, NV and Los Angeles, CA 
## [1]  train-merror:0.081641+0.000734  test-merror:0.097672+0.006247 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [53] train-merror:0.000266+0.000172  test-merror:0.019605+0.003566
## 
## [1]  train-merror:0.078422 
## [53] train-merror:0.000472 
## 
## Process for: Las Vegas, NV and Madison, WI 
## [1]  train-merror:0.066551+0.002143  test-merror:0.077254+0.002177 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [63] train-merror:0.000000+0.000000  test-merror:0.011959+0.002238
## 
## [1]  train-merror:0.067687 
## [63] train-merror:0.000000 
## 
## Process for: Las Vegas, NV and Miami, FL 
## [1]  train-merror:0.006452+0.001261  test-merror:0.014311+0.003452 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [33] train-merror:0.000000+0.000000  test-merror:0.004458+0.002118
## 
## [1]  train-merror:0.008563 
## [33] train-merror:0.000000 
## 
## Process for: Las Vegas, NV and Milwaukee, WI 
## [1]  train-merror:0.059536+0.002590  test-merror:0.070070+0.006890 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [65] train-merror:0.000000+0.000000  test-merror:0.010211+0.002692
## 
## [1]  train-merror:0.055869 
## [65] train-merror:0.000000 
## 
## Process for: Las Vegas, NV and Minneapolis, MN 
## [1]  train-merror:0.078065+0.009940  test-merror:0.091848+0.014942 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [91] train-merror:0.000000+0.000000  test-merror:0.007390+0.001949
## 
## [1]  train-merror:0.077771 
## [91] train-merror:0.000000 
## 
## Process for: Las Vegas, NV and New Orleans, LA 
## [1]  train-merror:0.030058+0.001572  test-merror:0.038594+0.008301 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [38] train-merror:0.000792+0.000199  test-merror:0.012434+0.002711
## 
## [1]  train-merror:0.028974 
## [38] train-merror:0.001408 
## 
## Process for: Las Vegas, NV and Newark, NJ 
## [1]  train-merror:0.093460+0.002494  test-merror:0.108971+0.006803 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [101]    train-merror:0.000000+0.000000  test-merror:0.019355+0.002547
## 
## [1]  train-merror:0.098065 
## [101]    train-merror:0.000000 
## 
## Process for: Las Vegas, NV and Philadelphia, PA 
## [1]  train-merror:0.090557+0.003454  test-merror:0.112023+0.007159 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [53] train-merror:0.000850+0.000339  test-merror:0.026978+0.003918
## 
## [1]  train-merror:0.089150 
## [53] train-merror:0.001877 
## 
## Process for: Las Vegas, NV and Phoenix, AZ 
## [1]  train-merror:0.228493+0.006733  test-merror:0.244446+0.008517 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [57] train-merror:0.031437+0.002314  test-merror:0.107651+0.003762
## 
## [1]  train-merror:0.241039 
## [57] train-merror:0.036785 
## 
## Process for: Las Vegas, NV and Saint Louis, MO 
## [1]  train-merror:0.081847+0.003955  test-merror:0.096774+0.009362 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [62] train-merror:0.000968+0.000302  test-merror:0.026862+0.002444
## 
## [1]  train-merror:0.091026 
## [62] train-merror:0.001525 
## 
## Process for: Las Vegas, NV and San Antonio, TX 
## [1]  train-merror:0.059364+0.002852  test-merror:0.072973+0.006618 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [34] train-merror:0.007321+0.000879  test-merror:0.027040+0.001955
## 
## [1]  train-merror:0.060220 
## [34] train-merror:0.007321 
## 
## Process for: Las Vegas, NV and San Diego, CA 
## [1]  train-merror:0.055660+0.002392  test-merror:0.066863+0.004359 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [72] train-merror:0.000029+0.000059  test-merror:0.010910+0.002641
## 
## [1]  train-merror:0.055132 
## [72] train-merror:0.000117 
## 
## Process for: Las Vegas, NV and San Francisco, CA 
## [1]  train-merror:0.031842+0.001220  test-merror:0.041789+0.003443 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [49] train-merror:0.000206+0.000072  test-merror:0.011771+0.001533
## 
## [1]  train-merror:0.035079 
## [49] train-merror:0.000118 
## 
## Process for: Las Vegas, NV and San Jose, CA 
## [1]  train-merror:0.054985+0.002104  test-merror:0.069560+0.010600 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [53] train-merror:0.000029+0.000059  test-merror:0.013138+0.002716
## 
## [1]  train-merror:0.055484 
## [53] train-merror:0.000117 
## 
## Process for: Las Vegas, NV and Seattle, WA 
## [1]  train-merror:0.051261+0.001889  test-merror:0.063697+0.007734 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [47] train-merror:0.000059+0.000072  test-merror:0.009150+0.001211
## 
## [1]  train-merror:0.051848 
## [47] train-merror:0.000000 
## 
## Process for: Las Vegas, NV and Tampa Bay, FL 
## [1]  train-merror:0.019238+0.000653  test-merror:0.030265+0.004974 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [57] train-merror:0.000000+0.000000  test-merror:0.007273+0.001840
## 
## [1]  train-merror:0.019120 
## [57] train-merror:0.000000 
## 
## Process for: Las Vegas, NV and Traverse City, MI 
## [1]  train-merror:0.069120+0.004760  test-merror:0.079883+0.007225 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [58] train-merror:0.000059+0.000072  test-merror:0.010557+0.002256
## 
## [1]  train-merror:0.064751 
## [58] train-merror:0.000117 
## 
## Process for: Las Vegas, NV and Washington, DC 
## [1]  train-merror:0.085534+0.003280  test-merror:0.103264+0.011943 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [71] train-merror:0.000565+0.000173  test-merror:0.027004+0.002611
## 
## [1]  train-merror:0.092196 
## [71] train-merror:0.000952 
## 
## Process for: Lincoln, NE and Los Angeles, CA 
## [1]  train-merror:0.109513+0.006465  test-merror:0.114919+0.012652 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [88] train-merror:0.000030+0.000059  test-merror:0.016180+0.003393
## 
## [1]  train-merror:0.107476 
## [88] train-merror:0.000000 
## 
## Process for: Lincoln, NE and Madison, WI 
## [1]  train-merror:0.280974+0.001787  test-merror:0.302799+0.006240 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [111]    train-merror:0.012258+0.001326  test-merror:0.152835+0.004937
## 
## [1]  train-merror:0.284980 
## [111]    train-merror:0.019254 
## 
## Process for: Lincoln, NE and Miami, FL 
## [1]  train-merror:0.050298+0.002260  test-merror:0.060997+0.009784 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [45] train-merror:0.000088+0.000117  test-merror:0.010263+0.002319
## 
## [1]  train-merror:0.052834 
## [45] train-merror:0.000350 
## 
## Process for: Lincoln, NE and Milwaukee, WI 
## [1]  train-merror:0.287559+0.007243  test-merror:0.300120+0.011291 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [90] train-merror:0.018955+0.002688  test-merror:0.138851+0.007741
## 
## [1]  train-merror:0.289202 
## [90] train-merror:0.024178 
## 
## Process for: Lincoln, NE and Minneapolis, MN 
## [1]  train-merror:0.296974+0.003372  test-merror:0.317003+0.008940 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [116]    train-merror:0.010526+0.001262  test-merror:0.139260+0.008548
## 
## [1]  train-merror:0.301376 
## [116]    train-merror:0.017845 
## 
## Process for: Lincoln, NE and New Orleans, LA 
## [1]  train-merror:0.136254+0.008289  test-merror:0.145299+0.006462 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [61] train-merror:0.000117+0.000110  test-merror:0.012762+0.002324
## 
## [1]  train-merror:0.137103 
## [61] train-merror:0.000702 
## 
## Process for: Lincoln, NE and Newark, NJ 
## [1]  train-merror:0.239561+0.001862  test-merror:0.258106+0.011395 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [135]    train-merror:0.001079+0.000376  test-merror:0.087590+0.004503
## 
## [1]  train-merror:0.241311 
## [135]    train-merror:0.002683 
## 
## Process for: Lincoln, NE and Philadelphia, PA 
## [1]  train-merror:0.239888+0.015410  test-merror:0.260143+0.014248 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [116]    train-merror:0.003332+0.000518  test-merror:0.096575+0.002868
## 
## [1]  train-merror:0.256869 
## [116]    train-merror:0.005495 
## 
## Process for: Lincoln, NE and Phoenix, AZ 
## [1]  train-merror:0.097162+0.004407  test-merror:0.102950+0.004620 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [74] train-merror:0.000059+0.000072  test-merror:0.017158+0.002389
## 
## [1]  train-merror:0.086967 
## [74] train-merror:0.000235 
## 
## Process for: Lincoln, NE and Saint Louis, MO 
## [1]  train-merror:0.290870+0.005826  test-merror:0.324734+0.009600 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [24] train-merror:0.117010+0.003871  test-merror:0.205818+0.008006
## 
## [1]  train-merror:0.292248 
## [24] train-merror:0.131934 
## 
## Process for: Lincoln, NE and San Antonio, TX 
## [1]  train-merror:0.177795+0.003965  test-merror:0.193412+0.012173 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [150]    train-merror:0.000000+0.000000  test-merror:0.017240+0.002349
## 
## [1]  train-merror:0.181603 
## [150]    train-merror:0.000000 
## 
## Process for: Lincoln, NE and San Diego, CA 
## [1]  train-merror:0.084868+0.002681  test-merror:0.093490+0.006965 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [98] train-merror:0.000000+0.000000  test-merror:0.010557+0.001700
## 
## [1]  train-merror:0.086334 
## [98] train-merror:0.000000 
## 
## Process for: Lincoln, NE and San Francisco, CA 
## [1]  train-merror:0.117098+0.007345  test-merror:0.127252+0.007523 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [40] train-merror:0.007034+0.001090  test-merror:0.032960+0.002702
## 
## [1]  train-merror:0.116186 
## [40] train-merror:0.007181 
## 
## Process for: Lincoln, NE and San Jose, CA 
## [1]  train-merror:0.107393+0.003962  test-merror:0.117284+0.007240 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [104]    train-merror:0.000175+0.000170  test-merror:0.027542+0.004349
## 
## [1]  train-merror:0.104213 
## [104]    train-merror:0.000117 
## 
## Process for: Lincoln, NE and Seattle, WA 
## [1]  train-merror:0.147946+0.006526  test-merror:0.164197+0.003085 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [82] train-merror:0.001079+0.000237  test-merror:0.035244+0.004824
## 
## [1]  train-merror:0.142257 
## [82] train-merror:0.001634 
## 
## Process for: Lincoln, NE and Tampa Bay, FL 
## [1]  train-merror:0.097633+0.006033  test-merror:0.108245+0.009953 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [93] train-merror:0.000000+0.000000  test-merror:0.014495+0.001456
## 
## [1]  train-merror:0.091409 
## [93] train-merror:0.000117 
## 
## Process for: Lincoln, NE and Traverse City, MI 
## [1]  train-merror:0.204076+0.005051  test-merror:0.224749+0.004603 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [54] train-merror:0.034202+0.001232  test-merror:0.113016+0.008152
## 
## [1]  train-merror:0.197691 
## [54] train-merror:0.041054 
## 
## Process for: Lincoln, NE and Washington, DC 
## [1]  train-merror:0.257762+0.006033  test-merror:0.279802+0.009003 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [75] train-merror:0.014454+0.003185  test-merror:0.094457+0.004455
## 
## [1]  train-merror:0.250773 
## [75] train-merror:0.020937 
## 
## Process for: Los Angeles, CA and Madison, WI 
## [1]  train-merror:0.132235+0.002011  test-merror:0.141834+0.006589 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [66] train-merror:0.002751+0.000243  test-merror:0.026549+0.001758
## 
## [1]  train-merror:0.125807 
## [66] train-merror:0.003946 
## 
## Process for: Los Angeles, CA and Miami, FL 
## [1]  train-merror:0.035697+0.001341  test-merror:0.044763+0.005529 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [43] train-merror:0.001535+0.000424  test-merror:0.013701+0.003890
## 
## [1]  train-merror:0.037203 
## [43] train-merror:0.002008 
## 
## Process for: Los Angeles, CA and Milwaukee, WI 
## [1]  train-merror:0.129149+0.003562  test-merror:0.144680+0.011858 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [64] train-merror:0.005492+0.000695  test-merror:0.030944+0.003771
## 
## [1]  train-merror:0.129444 
## [64] train-merror:0.005787 
## 
## Process for: Los Angeles, CA and Minneapolis, MN 
## [1]  train-merror:0.117131+0.002831  test-merror:0.127907+0.006837 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [35] train-merror:0.017214+0.001524  test-merror:0.036494+0.003176
## 
## [1]  train-merror:0.119050 
## [35] train-merror:0.014291 
## 
## Process for: Los Angeles, CA and New Orleans, LA 
## [1]  train-merror:0.084209+0.003080  test-merror:0.097200+0.009865 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [81] train-merror:0.000974+0.000472  test-merror:0.023149+0.002520
## 
## [1]  train-merror:0.083383 
## [81] train-merror:0.001063 
## 
## Process for: Los Angeles, CA and Newark, NJ 
## [1]  train-merror:0.122062+0.004273  test-merror:0.136529+0.007474 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [59] train-merror:0.008356+0.001244  test-merror:0.040510+0.002633
## 
## [1]  train-merror:0.124720 
## [59] train-merror:0.009685 
## 
## Process for: Los Angeles, CA and Philadelphia, PA 
## [1]  train-merror:0.132485+0.005707  test-merror:0.138891+0.011656 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [89] train-merror:0.001447+0.000432  test-merror:0.036258+0.002030
## 
## [1]  train-merror:0.130034 
## [89] train-merror:0.002244 
## 
## Process for: Los Angeles, CA and Phoenix, AZ 
## [1]  train-merror:0.096079+0.002777  test-merror:0.111492+0.007191 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [53] train-merror:0.000502+0.000274  test-merror:0.023148+0.005137
## 
## [1]  train-merror:0.097437 
## [53] train-merror:0.000472 
## 
## Process for: Los Angeles, CA and Saint Louis, MO 
## [1]  train-merror:0.132101+0.011787  test-merror:0.150110+0.010429 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [72] train-merror:0.000974+0.000200  test-merror:0.030826+0.001367
## 
## [1]  train-merror:0.124838 
## [72] train-merror:0.001299 
## 
## Process for: Los Angeles, CA and San Antonio, TX 
## [1]  train-merror:0.113706+0.006019  test-merror:0.121175+0.005869 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [74] train-merror:0.000000+0.000000  test-merror:0.005432+0.002343
## 
## [1]  train-merror:0.113381 
## [74] train-merror:0.000000 
## 
## Process for: Los Angeles, CA and San Diego, CA 
## [1]  train-merror:0.253690+0.013978  test-merror:0.270700+0.016080 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [60] train-merror:0.049870+0.002806  test-merror:0.155428+0.007386
## 
## [1]  train-merror:0.250738 
## [60] train-merror:0.061887 
## 
## Process for: Los Angeles, CA and San Francisco, CA 
## [1]  train-merror:0.176922+0.001430  test-merror:0.193812+0.001755 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [70] train-merror:0.018868+0.002394  test-merror:0.082911+0.004815
## 
## [1]  train-merror:0.186725 
## [70] train-merror:0.020314 
## 
## Process for: Los Angeles, CA and San Jose, CA 
## [1]  train-merror:0.207984+0.002052  test-merror:0.220739+0.005251 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [112]    train-merror:0.007323+0.000955  test-merror:0.077711+0.005222
## 
## [1]  train-merror:0.207630 
## [112]    train-merror:0.010630 
## 
## Process for: Los Angeles, CA and Seattle, WA 
## [1]  train-merror:0.114651+0.006498  test-merror:0.124838+0.007927 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [78] train-merror:0.000000+0.000000  test-merror:0.010630+0.002084
## 
## [1]  train-merror:0.117869 
## [78] train-merror:0.000000 
## 
## Process for: Los Angeles, CA and Tampa Bay, FL 
## [1]  train-merror:0.078452+0.001633  test-merror:0.097082+0.001752 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [56] train-merror:0.004252+0.000286  test-merror:0.028935+0.003502
## 
## [1]  train-merror:0.086926 
## [56] train-merror:0.004606 
## 
## Process for: Los Angeles, CA and Traverse City, MI 
## [1]  train-merror:0.135733+0.004318  test-merror:0.143499+0.007973 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [67] train-merror:0.005433+0.000342  test-merror:0.033778+0.005218
## 
## [1]  train-merror:0.126727 
## [67] train-merror:0.007559 
## 
## Process for: Los Angeles, CA and Washington, DC 
## [1]  train-merror:0.102427+0.002458  test-merror:0.113372+0.005906 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [32] train-merror:0.011093+0.000726  test-merror:0.032834+0.003919
## 
## [1]  train-merror:0.104330 
## [32] train-merror:0.015465 
## 
## Process for: Madison, WI and Miami, FL 
## [1]  train-merror:0.055041+0.004688  test-merror:0.062668+0.012922 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [23] train-merror:0.003408+0.000346  test-merror:0.011601+0.003481
## 
## [1]  train-merror:0.045683 
## [23] train-merror:0.004903 
## 
## Process for: Madison, WI and Milwaukee, WI 
## [1]  train-merror:0.343130+0.001830  test-merror:0.370725+0.012329 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [35] train-merror:0.169158+0.006432  test-merror:0.325760+0.008859
## 
## [1]  train-merror:0.351710 
## [35] train-merror:0.183808 
## 
## Process for: Madison, WI and Minneapolis, MN 
## [1]  train-merror:0.340917+0.004690  test-merror:0.368331+0.008804 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [60] train-merror:0.097076+0.002870  test-merror:0.271820+0.014750
## 
## [1]  train-merror:0.338555 
## [60] train-merror:0.115403 
## 
## Process for: Madison, WI and New Orleans, LA 
## [1]  train-merror:0.112832+0.002845  test-merror:0.120307+0.008572 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [62] train-merror:0.000628+0.000467  test-merror:0.012558+0.003193
## 
## [1]  train-merror:0.119230 
## [62] train-merror:0.000359 
## 
## Process for: Madison, WI and Newark, NJ 
## [1]  train-merror:0.309615+0.011151  test-merror:0.328272+0.008384 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [104]    train-merror:0.014082+0.001542  test-merror:0.135972+0.003500
## 
## [1]  train-merror:0.282110 
## [104]    train-merror:0.019732 
## 
## Process for: Madison, WI and Philadelphia, PA 
## [1]  train-merror:0.305668+0.011883  test-merror:0.324803+0.010878 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [107]    train-merror:0.010913+0.000954  test-merror:0.128078+0.007188
## 
## [1]  train-merror:0.299091 
## [107]    train-merror:0.016025 
## 
## Process for: Madison, WI and Phoenix, AZ 
## [1]  train-merror:0.065654+0.002413  test-merror:0.073190+0.008201 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [75] train-merror:0.000000+0.000000  test-merror:0.008251+0.001330
## 
## [1]  train-merror:0.066611 
## [75] train-merror:0.000000 
## 
## Process for: Madison, WI and Saint Louis, MO 
## [1]  train-merror:0.317806+0.005539  test-merror:0.334845+0.012369 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [45] train-merror:0.051513+0.002461  test-merror:0.146615+0.010620
## 
## [1]  train-merror:0.299928 
## [45] train-merror:0.050945 
## 
## Process for: Madison, WI and San Antonio, TX 
## [1]  train-merror:0.166438+0.027115  test-merror:0.175195+0.025140 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [88] train-merror:0.000000+0.000000  test-merror:0.004425+0.001632
## 
## [1]  train-merror:0.145420 
## [88] train-merror:0.000000 
## 
## Process for: Madison, WI and San Diego, CA 
## [1]  train-merror:0.120097+0.004877  test-merror:0.128918+0.007252 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [79] train-merror:0.000448+0.000189  test-merror:0.019732+0.002855
## 
## [1]  train-merror:0.114446 
## [79] train-merror:0.001315 
## 
## Process for: Madison, WI and San Francisco, CA 
## [1]  train-merror:0.112174+0.008230  test-merror:0.118395+0.007603 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [85] train-merror:0.001136+0.000429  test-merror:0.025711+0.003724
## 
## [1]  train-merror:0.117077 
## [85] train-merror:0.001196 
## 
## Process for: Madison, WI and San Jose, CA 
## [1]  train-merror:0.133550+0.005134  test-merror:0.145299+0.009070 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [51] train-merror:0.012079+0.001113  test-merror:0.041496+0.003707
## 
## [1]  train-merror:0.125688 
## [51] train-merror:0.013514 
## 
## Process for: Madison, WI and Seattle, WA 
## [1]  train-merror:0.177231+0.002693  test-merror:0.198517+0.002576 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [83] train-merror:0.002153+0.000870  test-merror:0.048314+0.004954
## 
## [1]  train-merror:0.196843 
## [83] train-merror:0.004066 
## 
## Process for: Madison, WI and Tampa Bay, FL 
## [1]  train-merror:0.089901+0.002920  test-merror:0.097464+0.004832 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [67] train-merror:0.000359+0.000243  test-merror:0.013155+0.002756
## 
## [1]  train-merror:0.091605 
## [67] train-merror:0.000718 
## 
## Process for: Madison, WI and Traverse City, MI 
## [1]  train-merror:0.349348+0.003361  test-merror:0.369645+0.013548 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [61] train-merror:0.083563+0.003603  test-merror:0.244797+0.009538
## 
## [1]  train-merror:0.348960 
## [61] train-merror:0.106673 
## 
## Process for: Madison, WI and Washington, DC 
## [1]  train-merror:0.325281+0.005274  test-merror:0.336044+0.009094 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [54] train-merror:0.041438+0.003595  test-merror:0.133581+0.009240
## 
## [1]  train-merror:0.328151 
## [54] train-merror:0.043291 
## 
## Process for: Miami, FL and Milwaukee, WI 
## [1]  train-merror:0.043339+0.009204  test-merror:0.044953+0.013349 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [37] train-merror:0.000264+0.000171  test-merror:0.006925+0.002445
## 
## [1]  train-merror:0.035798 
## [37] train-merror:0.000235 
## 
## Process for: Miami, FL and Minneapolis, MN 
## [1]  train-merror:0.043424+0.004959  test-merror:0.048449+0.007274 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [53] train-merror:0.000000+0.000000  test-merror:0.006739+0.002034
## 
## [1]  train-merror:0.034972 
## [53] train-merror:0.000116 
## 
## Process for: Miami, FL and New Orleans, LA 
## [1]  train-merror:0.261767+0.002527  test-merror:0.280060+0.014579 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [88] train-merror:0.006732+0.000444  test-merror:0.049526+0.003481
## 
## [1]  train-merror:0.258284 
## [88] train-merror:0.008547 
## 
## Process for: Miami, FL and Newark, NJ 
## [1]  train-merror:0.066750+0.008606  test-merror:0.077785+0.008994 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [62] train-merror:0.000786+0.000254  test-merror:0.012342+0.002586
## 
## [1]  train-merror:0.066139 
## [62] train-merror:0.001048 
## 
## Process for: Miami, FL and Philadelphia, PA 
## [1]  train-merror:0.095697+0.003855  test-merror:0.106395+0.007607 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [46] train-merror:0.001871+0.000407  test-merror:0.018122+0.001848
## 
## [1]  train-merror:0.094353 
## [46] train-merror:0.002806 
## 
## Process for: Miami, FL and Phoenix, AZ 
## [1]  train-merror:0.018245+0.001618  test-merror:0.027500+0.002635 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [32] train-merror:0.000029+0.000059  test-merror:0.006581+0.001076
## 
## [1]  train-merror:0.018451 
## [32] train-merror:0.000118 
## 
## Process for: Miami, FL and Saint Louis, MO 
## [1]  train-merror:0.099068+0.002773  test-merror:0.106368+0.004198 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [59] train-merror:0.000029+0.000059  test-merror:0.006685+0.001682
## 
## [1]  train-merror:0.096517 
## [59] train-merror:0.000000 
## 
## Process for: Miami, FL and San Antonio, TX 
## [1]  train-merror:0.129029+0.005615  test-merror:0.139924+0.007779 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [55] train-merror:0.000000+0.000000  test-merror:0.001771+0.000835
## 
## [1]  train-merror:0.129885 
## [55] train-merror:0.000000 
## 
## Process for: Miami, FL and San Diego, CA 
## [1]  train-merror:0.048856+0.000996  test-merror:0.057009+0.002311 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [70] train-merror:0.000850+0.000171  test-merror:0.017242+0.003714
## 
## [1]  train-merror:0.060059 
## [70] train-merror:0.001877 
## 
## Process for: Miami, FL and San Francisco, CA 
## [1]  train-merror:0.029164+0.002751  test-merror:0.037787+0.003712 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [31] train-merror:0.002913+0.000627  test-merror:0.017304+0.002570
## 
## [1]  train-merror:0.026839 
## [31] train-merror:0.003296 
## 
## Process for: Miami, FL and San Jose, CA 
## [1]  train-merror:0.037490+0.001739  test-merror:0.043996+0.004201 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [50] train-merror:0.002392+0.000286  test-merror:0.018789+0.003450
## 
## [1]  train-merror:0.035360 
## [50] train-merror:0.003501 
## 
## Process for: Miami, FL and Seattle, WA 
## [1]  train-merror:0.025995+0.001345  test-merror:0.028242+0.004333 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [43] train-merror:0.000146+0.000000  test-merror:0.005368+0.001132
## 
## [1]  train-merror:0.026841 
## [43] train-merror:0.000117 
## 
## Process for: Miami, FL and Tampa Bay, FL 
## [1]  train-merror:0.262011+0.006567  test-merror:0.282291+0.005762 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [82] train-merror:0.054004+0.000918  test-merror:0.184103+0.009504
## 
## [1]  train-merror:0.257393 
## [82] train-merror:0.062653 
## 
## Process for: Miami, FL and Traverse City, MI 
## [1]  train-merror:0.030208+0.002765  test-merror:0.033229+0.005340 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [30] train-merror:0.000842+0.000323  test-merror:0.008017+0.001443
## 
## [1]  train-merror:0.028233 
## [30] train-merror:0.000697 
## 
## Process for: Miami, FL and Washington, DC 
## [1]  train-merror:0.105282+0.002401  test-merror:0.118486+0.008783 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [35] train-merror:0.007703+0.000928  test-merror:0.026172+0.004697
## 
## [1]  train-merror:0.100048 
## [35] train-merror:0.007376 
## 
## Process for: Milwaukee, WI and Minneapolis, MN 
## [1]  train-merror:0.389025+0.010031  test-merror:0.413502+0.012894 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [77] train-merror:0.063057+0.002191  test-merror:0.244128+0.011117
## 
## [1]  train-merror:0.404577 
## [77] train-merror:0.072770 
## 
## Process for: Milwaukee, WI and New Orleans, LA 
## [1]  train-merror:0.122946+0.002855  test-merror:0.127932+0.007972 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [39] train-merror:0.001907+0.000185  test-merror:0.016315+0.003002
## 
## [1]  train-merror:0.119131 
## [39] train-merror:0.001995 
## 
## Process for: Milwaukee, WI and Newark, NJ 
## [1]  train-merror:0.322330+0.005410  test-merror:0.342018+0.006258 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [78] train-merror:0.032893+0.002046  test-merror:0.166314+0.011389
## 
## [1]  train-merror:0.334624 
## [78] train-merror:0.042371 
## 
## Process for: Milwaukee, WI and Philadelphia, PA 
## [1]  train-merror:0.319425+0.008072  test-merror:0.336267+0.021098 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [47] train-merror:0.074120+0.002132  test-merror:0.180282+0.007167
## 
## [1]  train-merror:0.308803 
## [47] train-merror:0.081690 
## 
## Process for: Milwaukee, WI and Phoenix, AZ 
## [1]  train-merror:0.056029+0.001339  test-merror:0.069340+0.005767 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [73] train-merror:0.000000+0.000000  test-merror:0.007639+0.001488
## 
## [1]  train-merror:0.064872 
## [73] train-merror:0.000000 
## 
## Process for: Milwaukee, WI and Saint Louis, MO 
## [1]  train-merror:0.306338+0.013560  test-merror:0.325819+0.015068 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [96] train-merror:0.011180+0.001005  test-merror:0.117369+0.005596
## 
## [1]  train-merror:0.318662 
## [96] train-merror:0.015141 
## 
## Process for: Milwaukee, WI and San Antonio, TX 
## [1]  train-merror:0.146328+0.003605  test-merror:0.157633+0.009268 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [77] train-merror:0.000000+0.000000  test-merror:0.004723+0.002010
## 
## [1]  train-merror:0.147951 
## [77] train-merror:0.000000 
## 
## Process for: Milwaukee, WI and San Diego, CA 
## [1]  train-merror:0.114847+0.008721  test-merror:0.131692+0.018077 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [37] train-merror:0.007717+0.000545  test-merror:0.025234+0.003480
## 
## [1]  train-merror:0.108333 
## [37] train-merror:0.011385 
## 
## Process for: Milwaukee, WI and San Francisco, CA 
## [1]  train-merror:0.110624+0.003551  test-merror:0.120542+0.010827 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [78] train-merror:0.001089+0.000578  test-merror:0.026955+0.004386
## 
## [1]  train-merror:0.114303 
## [78] train-merror:0.002590 
## 
## Process for: Milwaukee, WI and San Jose, CA 
## [1]  train-merror:0.126643+0.002462  test-merror:0.135914+0.009960 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [65] train-merror:0.008685+0.001215  test-merror:0.041665+0.006736
## 
## [1]  train-merror:0.129812 
## [65] train-merror:0.009859 
## 
## Process for: Milwaukee, WI and Seattle, WA 
## [1]  train-merror:0.178756+0.002084  test-merror:0.191667+0.006349 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [124]    train-merror:0.000147+0.000161  test-merror:0.042488+0.003667
## 
## [1]  train-merror:0.178639 
## [124]    train-merror:0.000000 
## 
## Process for: Milwaukee, WI and Tampa Bay, FL 
## [1]  train-merror:0.098709+0.006824  test-merror:0.107163+0.010523 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [41] train-merror:0.001409+0.000513  test-merror:0.013615+0.002386
## 
## [1]  train-merror:0.108333 
## [41] train-merror:0.001643 
## 
## Process for: Milwaukee, WI and Traverse City, MI 
## [1]  train-merror:0.298357+0.006259  test-merror:0.325825+0.012909 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [43] train-merror:0.103521+0.005380  test-merror:0.228168+0.007298
## 
## [1]  train-merror:0.306690 
## [43] train-merror:0.111972 
## 
## Process for: Milwaukee, WI and Washington, DC 
## [1]  train-merror:0.315400+0.005567  test-merror:0.331905+0.008424 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [68] train-merror:0.030484+0.002398  test-merror:0.125744+0.008243
## 
## [1]  train-merror:0.324530 
## [68] train-merror:0.034023 
## 
## Process for: Minneapolis, MN and New Orleans, LA 
## [1]  train-merror:0.122088+0.006988  test-merror:0.135116+0.008086 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [63] train-merror:0.000293+0.000307  test-merror:0.010890+0.003266
## 
## [1]  train-merror:0.111697 
## [63] train-merror:0.000117 
## 
## Process for: Minneapolis, MN and Newark, NJ 
## [1]  train-merror:0.313693+0.010443  test-merror:0.333486+0.014358 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [158]    train-merror:0.001630+0.000578  test-merror:0.108174+0.004901
## 
## [1]  train-merror:0.325454 
## [158]    train-merror:0.001747 
## 
## Process for: Minneapolis, MN and Philadelphia, PA 
## [1]  train-merror:0.308576+0.003836  test-merror:0.329946+0.016989 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [67] train-merror:0.029405+0.002167  test-merror:0.130481+0.010142
## 
## [1]  train-merror:0.332632 
## [67] train-merror:0.036362 
## 
## Process for: Minneapolis, MN and Phoenix, AZ 
## [1]  train-merror:0.075068+0.007662  test-merror:0.085088+0.009041 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [92] train-merror:0.000000+0.000000  test-merror:0.006464+0.002262
## 
## [1]  train-merror:0.090022 
## [92] train-merror:0.000000 
## 
## Process for: Minneapolis, MN and Saint Louis, MO 
## [1]  train-merror:0.283834+0.004396  test-merror:0.303975+0.007423 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [91] train-merror:0.010525+0.002256  test-merror:0.095226+0.008289
## 
## [1]  train-merror:0.305969 
## [91] train-merror:0.013604 
## 
## Process for: Minneapolis, MN and San Antonio, TX 
## [1]  train-merror:0.147891+0.011503  test-merror:0.155507+0.008028 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [69] train-merror:0.000000+0.000000  test-merror:0.004960+0.001273
## 
## [1]  train-merror:0.144409 
## [69] train-merror:0.000118 
## 
## Process for: Minneapolis, MN and San Diego, CA 
## [1]  train-merror:0.106217+0.009650  test-merror:0.111790+0.011988 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [92] train-merror:0.000381+0.000287  test-merror:0.018534+0.003473
## 
## [1]  train-merror:0.099707 
## [92] train-merror:0.000821 
## 
## Process for: Minneapolis, MN and San Francisco, CA 
## [1]  train-merror:0.120747+0.002830  test-merror:0.130901+0.004445 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [64] train-merror:0.002913+0.000741  test-merror:0.024485+0.003061
## 
## [1]  train-merror:0.123014 
## [64] train-merror:0.005180 
## 
## Process for: Minneapolis, MN and San Jose, CA 
## [1]  train-merror:0.138902+0.003207  test-merror:0.144590+0.009649 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [75] train-merror:0.006623+0.000602  test-merror:0.041428+0.004749
## 
## [1]  train-merror:0.140040 
## [75] train-merror:0.011320 
## 
## Process for: Minneapolis, MN and Seattle, WA 
## [1]  train-merror:0.159675+0.002296  test-merror:0.169914+0.003380 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [135]    train-merror:0.000000+0.000000  test-merror:0.031859+0.003104
## 
## [1]  train-merror:0.165597 
## [135]    train-merror:0.000117 
## 
## Process for: Minneapolis, MN and Tampa Bay, FL 
## [1]  train-merror:0.086178+0.002271  test-merror:0.093981+0.005446 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [29] train-merror:0.004734+0.000301  test-merror:0.014962+0.001977
## 
## [1]  train-merror:0.081005 
## [29] train-merror:0.006429 
## 
## Process for: Minneapolis, MN and Traverse City, MI 
## [1]  train-merror:0.296602+0.002499  test-merror:0.315387+0.006579 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [88] train-merror:0.041160+0.000687  test-merror:0.193701+0.006339
## 
## [1]  train-merror:0.300336 
## [88] train-merror:0.065764 
## 
## Process for: Minneapolis, MN and Washington, DC 
## [1]  train-merror:0.290923+0.007155  test-merror:0.306090+0.007895 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [125]    train-merror:0.006483+0.000732  test-merror:0.092672+0.004156
## 
## [1]  train-merror:0.304544 
## [125]    train-merror:0.007376 
## 
## Process for: New Orleans, LA and Newark, NJ 
## [1]  train-merror:0.142782+0.010368  test-merror:0.154900+0.006676 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [54] train-merror:0.003600+0.000709  test-merror:0.028451+0.001718
## 
## [1]  train-merror:0.135464 
## [54] train-merror:0.003981 
## 
## Process for: New Orleans, LA and Philadelphia, PA 
## [1]  train-merror:0.169945+0.002737  test-merror:0.183354+0.015091 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [59] train-merror:0.003308+0.000678  test-merror:0.038052+0.002776
## 
## [1]  train-merror:0.138040 
## [59] train-merror:0.003161 
## 
## Process for: New Orleans, LA and Phoenix, AZ 
## [1]  train-merror:0.045041+0.002166  test-merror:0.052297+0.005977 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [63] train-merror:0.000000+0.000000  test-merror:0.008109+0.002612
## 
## [1]  train-merror:0.044189 
## [63] train-merror:0.000000 
## 
## Process for: New Orleans, LA and Saint Louis, MO 
## [1]  train-merror:0.166266+0.006179  test-merror:0.178257+0.010278 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [64] train-merror:0.000352+0.000117  test-merror:0.012666+0.002152
## 
## [1]  train-merror:0.179313 
## [64] train-merror:0.000469 
## 
## Process for: New Orleans, LA and San Antonio, TX 
## [1]  train-merror:0.165102+0.005560  test-merror:0.169205+0.006437 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [38] train-merror:0.000148+0.000132  test-merror:0.002952+0.001446
## 
## [1]  train-merror:0.160350 
## [38] train-merror:0.000118 
## 
## Process for: New Orleans, LA and San Diego, CA 
## [1]  train-merror:0.099502+0.004291  test-merror:0.108739+0.012222 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [69] train-merror:0.002375+0.000409  test-merror:0.025924+0.005594
## 
## [1]  train-merror:0.103695 
## [69] train-merror:0.003167 
## 
## Process for: New Orleans, LA and San Francisco, CA 
## [1]  train-merror:0.078193+0.002521  test-merror:0.093583+0.005605 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [113]    train-merror:0.000059+0.000072  test-merror:0.031312+0.003590
## 
## [1]  train-merror:0.079341 
## [113]    train-merror:0.000706 
## 
## Process for: New Orleans, LA and San Jose, CA 
## [1]  train-merror:0.079499+0.004594  test-merror:0.089100+0.005925 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [100]    train-merror:0.000176+0.000215  test-merror:0.025172+0.002368
## 
## [1]  train-merror:0.080318 
## [100]    train-merror:0.000585 
## 
## Process for: New Orleans, LA and Seattle, WA 
## [1]  train-merror:0.100984+0.003782  test-merror:0.108769+0.006641 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [35] train-merror:0.007201+0.001057  test-merror:0.025876+0.002928
## 
## [1]  train-merror:0.103735 
## [35] train-merror:0.008079 
## 
## Process for: New Orleans, LA and Tampa Bay, FL 
## [1]  train-merror:0.324816+0.005490  test-merror:0.353118+0.007889 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [109]    train-merror:0.002049+0.000404  test-merror:0.036410+0.007054
## 
## [1]  train-merror:0.311322 
## [109]    train-merror:0.003512 
## 
## Process for: New Orleans, LA and Traverse City, MI 
## [1]  train-merror:0.097150+0.007396  test-merror:0.107126+0.010671 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [61] train-merror:0.000088+0.000072  test-merror:0.012176+0.001827
## 
## [1]  train-merror:0.095539 
## [61] train-merror:0.000117 
## 
## Process for: New Orleans, LA and Washington, DC 
## [1]  train-merror:0.179455+0.009434  test-merror:0.194267+0.013927 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [68] train-merror:0.004848+0.000812  test-merror:0.049133+0.006711
## 
## [1]  train-merror:0.193433 
## [68] train-merror:0.007495 
## 
## Process for: Newark, NJ and Philadelphia, PA 
## [1]  train-merror:0.401964+0.004704  test-merror:0.438678+0.007361 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [38] train-merror:0.189085+0.007493  test-merror:0.378465+0.005328
## 
## [1]  train-merror:0.412604 
## [38] train-merror:0.204840 
## 
## Process for: Newark, NJ and Phoenix, AZ 
## [1]  train-merror:0.088700+0.001940  test-merror:0.104359+0.007927 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [99] train-merror:0.000000+0.000000  test-merror:0.011517+0.002796
## 
## [1]  train-merror:0.096956 
## [99] train-merror:0.000000 
## 
## Process for: Newark, NJ and Saint Louis, MO 
## [1]  train-merror:0.338923+0.009989  test-merror:0.363903+0.011389 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [123]    train-merror:0.004193+0.001048  test-merror:0.072006+0.005535
## 
## [1]  train-merror:0.324616 
## [123]    train-merror:0.006333 
## 
## Process for: Newark, NJ and San Antonio, TX 
## [1]  train-merror:0.170918+0.003801  test-merror:0.187744+0.006355 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [90] train-merror:0.000000+0.000000  test-merror:0.005668+0.001029
## 
## [1]  train-merror:0.189515 
## [90] train-merror:0.000000 
## 
## Process for: Newark, NJ and San Diego, CA 
## [1]  train-merror:0.111026+0.001770  test-merror:0.121292+0.007754 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [45] train-merror:0.008563+0.000739  test-merror:0.031437+0.002815
## 
## [1]  train-merror:0.103460 
## [45] train-merror:0.008798 
## 
## Process for: Newark, NJ and San Francisco, CA 
## [1]  train-merror:0.117422+0.004278  test-merror:0.129723+0.008161 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [72] train-merror:0.004561+0.001387  test-merror:0.039082+0.004933
## 
## [1]  train-merror:0.114420 
## [72] train-merror:0.005062 
## 
## Process for: Newark, NJ and San Jose, CA 
## [1]  train-merror:0.129770+0.003564  test-merror:0.139107+0.007157 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [80] train-merror:0.006798+0.001440  test-merror:0.049247+0.002164
## 
## [1]  train-merror:0.132571 
## [80] train-merror:0.010970 
## 
## Process for: Newark, NJ and Seattle, WA 
## [1]  train-merror:0.202445+0.006748  test-merror:0.215078+0.008611 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [153]    train-merror:0.000000+0.000000  test-merror:0.013302+0.004478
## 
## [1]  train-merror:0.213677 
## [153]    train-merror:0.000000 
## 
## Process for: Newark, NJ and Tampa Bay, FL 
## [1]  train-merror:0.106254+0.005231  test-merror:0.118060+0.009575 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [48] train-merror:0.005582+0.000818  test-merror:0.025833+0.002624
## 
## [1]  train-merror:0.104150 
## [48] train-merror:0.006663 
## 
## Process for: Newark, NJ and Traverse City, MI 
## [1]  train-merror:0.269125+0.000980  test-merror:0.298206+0.003133 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [101]    train-merror:0.016069+0.002515  test-merror:0.138333+0.003504
## 
## [1]  train-merror:0.277480 
## [101]    train-merror:0.023638 
## 
## Process for: Newark, NJ and Washington, DC 
## [1]  train-merror:0.353765+0.003437  test-merror:0.384726+0.015681 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [55] train-merror:0.087021+0.002704  test-merror:0.248633+0.005977
## 
## [1]  train-merror:0.366048 
## [55] train-merror:0.103379 
## 
## Process for: Philadelphia, PA and Phoenix, AZ 
## [1]  train-merror:0.098837+0.002781  test-merror:0.110472+0.006817 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [58] train-merror:0.000147+0.000093  test-merror:0.015865+0.002651
## 
## [1]  train-merror:0.093078 
## [58] train-merror:0.000118 
## 
## Process for: Philadelphia, PA and Saint Louis, MO 
## [1]  train-merror:0.341210+0.003229  test-merror:0.364255+0.011955 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [114]    train-merror:0.005600+0.000536  test-merror:0.072007+0.004880
## 
## [1]  train-merror:0.339158 
## [114]    train-merror:0.009147 
## 
## Process for: Philadelphia, PA and San Antonio, TX 
## [1]  train-merror:0.212924+0.011165  test-merror:0.236152+0.017399 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [89] train-merror:0.000000+0.000000  test-merror:0.006966+0.001881
## 
## [1]  train-merror:0.225056 
## [89] train-merror:0.000000 
## 
## Process for: Philadelphia, PA and San Diego, CA 
## [1]  train-merror:0.101759+0.004393  test-merror:0.111085+0.004788 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [98] train-merror:0.000851+0.000408  test-merror:0.029091+0.003432
## 
## [1]  train-merror:0.102757 
## [98] train-merror:0.001290 
## 
## Process for: Philadelphia, PA and San Francisco, CA 
## [1]  train-merror:0.120335+0.003594  test-merror:0.133375+0.009441 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [60] train-merror:0.006887+0.000909  test-merror:0.036846+0.003758
## 
## [1]  train-merror:0.116775 
## [60] train-merror:0.008005 
## 
## Process for: Philadelphia, PA and San Jose, CA 
## [1]  train-merror:0.127791+0.003525  test-merror:0.137496+0.007730 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [35] train-merror:0.023442+0.001813  test-merror:0.056589+0.004897
## 
## [1]  train-merror:0.133520 
## [35] train-merror:0.025956 
## 
## Process for: Philadelphia, PA and Seattle, WA 
## [1]  train-merror:0.191804+0.006901  test-merror:0.199696+0.008167 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [96] train-merror:0.000117+0.000058  test-merror:0.016252+0.002007
## 
## [1]  train-merror:0.191512 
## [96] train-merror:0.000117 
## 
## Process for: Philadelphia, PA and Tampa Bay, FL 
## [1]  train-merror:0.114287+0.003144  test-merror:0.122884+0.012221 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [63] train-merror:0.002952+0.000587  test-merror:0.030281+0.004679
## 
## [1]  train-merror:0.118438 
## [63] train-merror:0.003508 
## 
## Process for: Philadelphia, PA and Traverse City, MI 
## [1]  train-merror:0.254238+0.004995  test-merror:0.275107+0.005978 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [70] train-merror:0.031275+0.001225  test-merror:0.131766+0.007604
## 
## [1]  train-merror:0.262013 
## [70] train-merror:0.037531 
## 
## Process for: Philadelphia, PA and Washington, DC 
## [1]  train-merror:0.359238+0.003314  test-merror:0.390554+0.010362 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [32] train-merror:0.158547+0.004485  test-merror:0.301689+0.008371
## 
## [1]  train-merror:0.354271 
## [32] train-merror:0.183321 
## 
## Process for: Phoenix, AZ and Saint Louis, MO 
## [1]  train-merror:0.101892+0.006063  test-merror:0.118464+0.004810 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [58] train-merror:0.000911+0.000364  test-merror:0.024563+0.002563
## 
## [1]  train-merror:0.105183 
## [58] train-merror:0.001410 
## 
## Process for: Phoenix, AZ and San Antonio, TX 
## [1]  train-merror:0.083835+0.004001  test-merror:0.098831+0.003584 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [59] train-merror:0.002273+0.000200  test-merror:0.036959+0.004940
## 
## [1]  train-merror:0.080411 
## [59] train-merror:0.003306 
## 
## Process for: Phoenix, AZ and San Diego, CA 
## [1]  train-merror:0.088730+0.004783  test-merror:0.098719+0.007214 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [78] train-merror:0.000000+0.000000  test-merror:0.010577+0.000746
## 
## [1]  train-merror:0.087202 
## [78] train-merror:0.000000 
## 
## Process for: Phoenix, AZ and San Francisco, CA 
## [1]  train-merror:0.052413+0.004073  test-merror:0.062625+0.007049 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [58] train-merror:0.000088+0.000118  test-merror:0.018128+0.002086
## 
## [1]  train-merror:0.052384 
## [58] train-merror:0.000118 
## 
## Process for: Phoenix, AZ and San Jose, CA 
## [1]  train-merror:0.063227+0.001266  test-merror:0.076977+0.002648 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [65] train-merror:0.000118+0.000171  test-merror:0.015866+0.001933
## 
## [1]  train-merror:0.068633 
## [65] train-merror:0.000470 
## 
## Process for: Phoenix, AZ and Seattle, WA 
## [1]  train-merror:0.055618+0.005247  test-merror:0.065813+0.010102 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [50] train-merror:0.000000+0.000000  test-merror:0.006229+0.001559
## 
## [1]  train-merror:0.056176 
## [50] train-merror:0.000000 
## 
## Process for: Phoenix, AZ and Tampa Bay, FL 
## [1]  train-merror:0.029674+0.001376  test-merror:0.036315+0.002869 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [41] train-merror:0.000029+0.000059  test-merror:0.009167+0.001469
## 
## [1]  train-merror:0.034787 
## [41] train-merror:0.000235 
## 
## Process for: Phoenix, AZ and Traverse City, MI 
## [1]  train-merror:0.052298+0.002471  test-merror:0.058526+0.002694 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [41] train-merror:0.000265+0.000269  test-merror:0.009990+0.002102
## 
## [1]  train-merror:0.057586 
## [41] train-merror:0.000588 
## 
## Process for: Phoenix, AZ and Washington, DC 
## [1]  train-merror:0.090917+0.005422  test-merror:0.107659+0.008876 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [97] train-merror:0.000000+0.000000  test-merror:0.011540+0.001391
## 
## [1]  train-merror:0.096241 
## [97] train-merror:0.000000 
## 
## Process for: Saint Louis, MO and San Antonio, TX 
## [1]  train-merror:0.272582+0.005530  test-merror:0.286456+0.005694 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [101]    train-merror:0.000000+0.000000  test-merror:0.029637+0.004515
## 
## [1]  train-merror:0.268981 
## [101]    train-merror:0.000118 
## 
## Process for: Saint Louis, MO and San Diego, CA 
## [1]  train-merror:0.106979+0.001119  test-merror:0.117419+0.005024 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [77] train-merror:0.000147+0.000093  test-merror:0.014663+0.003709
## 
## [1]  train-merror:0.101466 
## [77] train-merror:0.000117 
## 
## Process for: Saint Louis, MO and San Francisco, CA 
## [1]  train-merror:0.126486+0.004375  test-merror:0.139965+0.006926 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [82] train-merror:0.000530+0.000343  test-merror:0.032137+0.005822
## 
## [1]  train-merror:0.121836 
## [82] train-merror:0.000942 
## 
## Process for: Saint Louis, MO and San Jose, CA 
## [1]  train-merror:0.133282+0.003603  test-merror:0.142141+0.013214 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [72] train-merror:0.002785+0.000425  test-merror:0.037998+0.007298
## 
## [1]  train-merror:0.138384 
## [72] train-merror:0.003987 
## 
## Process for: Saint Louis, MO and Seattle, WA 
## [1]  train-merror:0.171719+0.007966  test-merror:0.181894+0.009796 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [85] train-merror:0.000498+0.000287  test-merror:0.033775+0.004204
## 
## [1]  train-merror:0.170517 
## [85] train-merror:0.000938 
## 
## Process for: Saint Louis, MO and Tampa Bay, FL 
## [1]  train-merror:0.139908+0.010098  test-merror:0.154688+0.017624 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [72] train-merror:0.000000+0.000000  test-merror:0.012431+0.001501
## 
## [1]  train-merror:0.138267 
## [72] train-merror:0.000235 
## 
## Process for: Saint Louis, MO and Traverse City, MI 
## [1]  train-merror:0.258297+0.012488  test-merror:0.281459+0.005041 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [121]    train-merror:0.001964+0.000431  test-merror:0.084320+0.003259
## 
## [1]  train-merror:0.238419 
## [121]    train-merror:0.002228 
## 
## Process for: Saint Louis, MO and Washington, DC 
## [1]  train-merror:0.315311+0.006148  test-merror:0.343921+0.010008 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [104]    train-merror:0.008149+0.001070  test-merror:0.081251+0.005387
## 
## [1]  train-merror:0.340828 
## [104]    train-merror:0.012610 
## 
## Process for: San Antonio, TX and San Diego, CA 
## [1]  train-merror:0.125310+0.009047  test-merror:0.130710+0.007487 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [43] train-merror:0.000118+0.000110  test-merror:0.004015+0.001145
## 
## [1]  train-merror:0.122210 
## [43] train-merror:0.000236 
## 
## Process for: San Antonio, TX and San Francisco, CA 
## [1]  train-merror:0.118432+0.006672  test-merror:0.134018+0.006378 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [88] train-merror:0.000000+0.000000  test-merror:0.002834+0.001367
## 
## [1]  train-merror:0.114653 
## [88] train-merror:0.000000 
## 
## Process for: San Antonio, TX and San Jose, CA 
## [1]  train-merror:0.114211+0.001881  test-merror:0.123626+0.008178 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [71] train-merror:0.000000+0.000000  test-merror:0.005904+0.002144
## 
## [1]  train-merror:0.118432 
## [71] train-merror:0.000000 
## 
## Process for: San Antonio, TX and Seattle, WA 
## [1]  train-merror:0.109635+0.007095  test-merror:0.117610+0.012336 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [48] train-merror:0.000000+0.000000  test-merror:0.003778+0.001095
## 
## [1]  train-merror:0.097532 
## [48] train-merror:0.000000 
## 
## Process for: San Antonio, TX and Tampa Bay, FL 
## [1]  train-merror:0.196688+0.003132  test-merror:0.211124+0.004820 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [55] train-merror:0.000000+0.000000  test-merror:0.003188+0.000709
## 
## [1]  train-merror:0.193057 
## [55] train-merror:0.000118 
## 
## Process for: San Antonio, TX and Traverse City, MI 
## [1]  train-merror:0.138239+0.014099  test-merror:0.147835+0.011333 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [76] train-merror:0.000000+0.000000  test-merror:0.004251+0.001601
## 
## [1]  train-merror:0.116661 
## [76] train-merror:0.000000 
## 
## Process for: San Antonio, TX and Washington, DC 
## [1]  train-merror:0.205062+0.001354  test-merror:0.222100+0.011420 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [75] train-merror:0.000000+0.000000  test-merror:0.006543+0.002228
## 
## [1]  train-merror:0.198430 
## [75] train-merror:0.000000 
## 
## Process for: San Diego, CA and San Francisco, CA 
## [1]  train-merror:0.138728+0.003470  test-merror:0.154327+0.006533 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [38] train-merror:0.026663+0.000398  test-merror:0.070042+0.003587
## 
## [1]  train-merror:0.137375 
## [38] train-merror:0.029665 
## 
## Process for: San Diego, CA and San Jose, CA 
## [1]  train-merror:0.240147+0.008258  test-merror:0.251730+0.011174 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [75] train-merror:0.017009+0.000932  test-merror:0.080938+0.007391
## 
## [1]  train-merror:0.219238 
## [75] train-merror:0.020645 
## 
## Process for: San Diego, CA and Seattle, WA 
## [1]  train-merror:0.093519+0.002511  test-merror:0.104282+0.011019 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [72] train-merror:0.000147+0.000093  test-merror:0.012082+0.004476
## 
## [1]  train-merror:0.092434 
## [72] train-merror:0.000117 
## 
## Process for: San Diego, CA and Tampa Bay, FL 
## [1]  train-merror:0.086510+0.001478  test-merror:0.098416+0.006403 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [71] train-merror:0.003284+0.000812  test-merror:0.029560+0.001673
## 
## [1]  train-merror:0.085396 
## [71] train-merror:0.004457 
## 
## Process for: San Diego, CA and Traverse City, MI 
## [1]  train-merror:0.104692+0.003881  test-merror:0.114371+0.007696 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [85] train-merror:0.001584+0.000458  test-merror:0.020410+0.003109
## 
## [1]  train-merror:0.113314 
## [85] train-merror:0.001642 
## 
## Process for: San Diego, CA and Washington, DC 
## [1]  train-merror:0.108494+0.005193  test-merror:0.116228+0.007665 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [57] train-merror:0.002528+0.000451  test-merror:0.026529+0.003119
## 
## [1]  train-merror:0.105282 
## [57] train-merror:0.003212 
## 
## Process for: San Francisco, CA and San Jose, CA 
## [1]  train-merror:0.250766+0.008463  test-merror:0.264388+0.007883 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [39] train-merror:0.085138+0.005035  test-merror:0.171159+0.005039
## 
## [1]  train-merror:0.239317 
## [39] train-merror:0.090995 
## 
## Process for: San Francisco, CA and Seattle, WA 
## [1]  train-merror:0.155974+0.010484  test-merror:0.166688+0.014858 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [91] train-merror:0.000294+0.000131  test-merror:0.023542+0.003424
## 
## [1]  train-merror:0.140906 
## [91] train-merror:0.000235 
## 
## Process for: San Francisco, CA and Tampa Bay, FL 
## [1]  train-merror:0.076221+0.004020  test-merror:0.086404+0.007274 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [71] train-merror:0.001207+0.000487  test-merror:0.029429+0.005035
## 
## [1]  train-merror:0.073220 
## [71] train-merror:0.001766 
## 
## Process for: San Francisco, CA and Traverse City, MI 
## [1]  train-merror:0.125634+0.016090  test-merror:0.133368+0.019255 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [76] train-merror:0.002089+0.000530  test-merror:0.025545+0.002464
## 
## [1]  train-merror:0.112301 
## [76] train-merror:0.002354 
## 
## Process for: San Francisco, CA and Washington, DC 
## [1]  train-merror:0.114502+0.002235  test-merror:0.125623+0.006095 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [48] train-merror:0.007049+0.000756  test-merror:0.041398+0.003877
## 
## [1]  train-merror:0.122175 
## [48] train-merror:0.009279 
## 
## Process for: San Jose, CA and Seattle, WA 
## [1]  train-merror:0.172278+0.004475  test-merror:0.187303+0.009270 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [71] train-merror:0.000642+0.000270  test-merror:0.022640+0.002224
## 
## [1]  train-merror:0.171082 
## [71] train-merror:0.001750 
## 
## Process for: San Jose, CA and Tampa Bay, FL 
## [1]  train-merror:0.069813+0.001528  test-merror:0.077617+0.005402 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [51] train-merror:0.005406+0.000745  test-merror:0.027352+0.001244
## 
## [1]  train-merror:0.071186 
## [51] train-merror:0.006897 
## 
## Process for: San Jose, CA and Traverse City, MI 
## [1]  train-merror:0.139864+0.005094  test-merror:0.151008+0.011796 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [71] train-merror:0.008402+0.000687  test-merror:0.045046+0.002411
## 
## [1]  train-merror:0.144241 
## [71] train-merror:0.010620 
## 
## Process for: San Jose, CA and Washington, DC 
## [1]  train-merror:0.117743+0.003171  test-merror:0.125268+0.007953 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [83] train-merror:0.005175+0.000641  test-merror:0.043302+0.004652
## 
## [1]  train-merror:0.115394 
## [83] train-merror:0.005472 
## 
## Process for: Seattle, WA and Tampa Bay, FL 
## [1]  train-merror:0.075950+0.010073  test-merror:0.083694+0.011275 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [46] train-merror:0.001870+0.000326  test-merror:0.014611+0.003367
## 
## [1]  train-merror:0.057393 
## [46] train-merror:0.002104 
## 
## Process for: Seattle, WA and Traverse City, MI 
## [1]  train-merror:0.185057+0.005538  test-merror:0.196988+0.006037 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [71] train-merror:0.003355+0.000799  test-merror:0.046329+0.005323
## 
## [1]  train-merror:0.178784 
## [71] train-merror:0.004668 
## 
## Process for: Seattle, WA and Washington, DC 
## [1]  train-merror:0.187872+0.003079  test-merror:0.198071+0.011079 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [120]    train-merror:0.000030+0.000060  test-merror:0.016417+0.004899
## 
## [1]  train-merror:0.188080 
## [120]    train-merror:0.000000 
## 
## Process for: Tampa Bay, FL and Traverse City, MI 
## [1]  train-merror:0.070806+0.000703  test-merror:0.074225+0.004372 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [51] train-merror:0.000672+0.000353  test-merror:0.009351+0.003115
## 
## [1]  train-merror:0.068498 
## [51] train-merror:0.001403 
## 
## Process for: Tampa Bay, FL and Washington, DC 
## [1]  train-merror:0.133774+0.001078  test-merror:0.141090+0.015461 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [53] train-merror:0.009249+0.000864  test-merror:0.042233+0.003937
## 
## [1]  train-merror:0.143945 
## [53] train-merror:0.011777 
## 
## Process for: Traverse City, MI and Washington, DC 
## [1]  train-merror:0.259874+0.006179  test-merror:0.290623+0.013868 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## Stopping. Best iteration:
## [52] train-merror:0.028938+0.000779  test-merror:0.101355+0.008096
## 
## [1]  train-merror:0.243279 
## [52] train-merror:0.036640

Suppose then that the test data are subset to only these 10 locales, and that each one vs. one prediction is run on the data subset:

# Prepare the test data
testOneOne_002 <- fullDataSplit$testData %>%
    select_at(vars(all_of(c(locXGBPreds, keepVarFull)))) %>%
    filter(locNamefct %in% oneoneLocs_002) %>%
    mutate_if(is.factor, .funs=fct_drop)

# Function to create the probabilities by locale
predictOneOneProbs_002 <- function(x) {
    helperXGBPredict(x$mdlResult$xgbModel, 
                     dfSparse=helperMakeSparse(testOneOne_002, 
                                               depVar="locNamefct", 
                                               predVars=locXGBPreds
                                               ), 
                     objective="multi:softprob", 
                     probMatrix=TRUE, 
                     yLevels=x$mdlResult$yTrainLevels
                     )$predData %>%
        mutate(rownum=row_number(), predicted=as.character(predicted))
}

# Extract from list
allOneOneProbs_002 <- map_dfr(localeOnevOne_002, .f=predictOneOneProbs_002, .id="Run")

# Make the prediction based on 1) most votes, then 2) highest probability when in majority
locOneOne_002 <- allOneOneProbs_002 %>%
    group_by(rownum, predicted) %>%
    summarize(n=n(), highProb=sum(probPredicted >= 0.9), majProb=sum(probPredicted)) %>%
    filter(n==max(n)) %>%
    ungroup() %>%
    arrange(rownum, -highProb, -majProb) %>%
    group_by(rownum) %>%
    filter(row_number()==1) %>%
    ungroup() %>%
    bind_cols(select(testOneOne_002, locNamefct, source, dtime))

# Overall accuracy by locale
locOneOne_002 %>%
    mutate(correct=predicted==locNamefct) %>%
    group_by(locNamefct) %>%
    summarize(pctCorrect=mean(correct), n=n()) %>%
    ggplot(aes(x=fct_reorder(locNamefct, pctCorrect), y=pctCorrect)) + 
    geom_point() + 
    geom_text(aes(y=pctCorrect+0.02, label=paste0(round(100*pctCorrect), "%")), hjust=0, size=3.5) + 
    coord_flip() + 
    labs(x="Actual Locale", 
         y="Percent Correctly Predicted", 
         title="Predictions based on Maximum Probability"
         ) + 
    ylim(c(0, 1.1))

# Confusion matrix
locOneOne_002 %>%
    mutate(name=factor(predicted)) %>%
    count(locNamefct, name) %>%
    group_by(locNamefct) %>%
    mutate(pct=n/sum(n)) %>%
    ungroup() %>%
    ggplot(aes(x=locNamefct, y=name)) + 
    geom_tile(aes(fill=pct)) + 
    geom_text(aes(label=paste0(round(100*pct), "%")), size=2.5) + 
    coord_flip() + 
    labs(x="Actual Locale", 
         y="Predicted Locale", 
         title="Predictions based on Maximum Probability"
         ) + 
    scale_fill_continuous(low="white", high="lightgreen") + 
    theme(axis.text.x=element_text(angle=90, hjust=1))

This approach increases the percentage of correct classifications from 44% to 55%. The cold-weather cities in particular should be explored to see whether accuracies can be further improved.
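The overall accuracy figures quoted above are simply the mean of the row-wise correct-prediction indicator. A minimal base-R sketch, using a hypothetical toy frame standing in for locOneOne_002 (columns assumed from the code above):

```r
# Toy stand-in for locOneOne_002 (hypothetical values; columns assumed from above)
toy <- data.frame(predicted =c("Seattle, WA", "Tampa Bay, FL", "Seattle, WA", "Miami, FL"),
                  locNamefct=c("Seattle, WA", "Tampa Bay, FL", "San Jose, CA", "Miami, FL"),
                  stringsAsFactors=FALSE)

# Overall accuracy is the mean of the row-wise match indicator
mean(toy$predicted == toy$locNamefct)   # 0.75 on this toy frame
```

The by-locale version in the plot above is the same computation after a group_by(locNamefct).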

Suppose that the objective is changed to classifying cities into broad buckets based on the confusion matrix outcomes:

  • Tampa, Houston, Miami - Humid
  • DC, Newark, Philadelphia - Mid-Atlantic
  • San Jose, San Francisco - Marine, N CA
  • San Diego, Los Angeles - Marine, S CA
  • Saint Louis, Dallas, Atlanta - South Central
  • Phoenix, Las Vegas - Desert
  • Traverse City, Chicago, Detroit, Grand Rapids, Green Bay, Madison, Milwaukee, Minneapolis - Wintry

Suppose also that the stand-alone cities that are generally well classified are excluded:

  • Seattle, San Antonio, New Orleans, Denver

Suppose also that the stand-alone cities that are broadly misclassified are excluded:

  • Indianapolis, Lincoln, Boston

A mapping vector is created to convert locNamefct to groupNamefct:

mapLoc2Group <- c("Atlanta, GA"="South Central", 
                  "Boston, MA"="Exclude", 
                  "Chicago, IL"="Wintry", 
                  "Dallas, TX"="South Central", 
                  "Denver, CO"="Exclude", 
                  "Detroit, MI"="Wintry", 
                  "Grand Rapids, MI"="Wintry", 
                  "Green Bay, WI"="Wintry", 
                  "Houston, TX"="Humid", 
                  "Indianapolis, IN"="Exclude", 
                  "Las Vegas, NV"="Desert", 
                  "Lincoln, NE"="Exclude", 
                  "Los Angeles, CA"="Marine, S CA", 
                  "Madison, WI"="Wintry", 
                  "Miami, FL"="Humid", 
                  "Milwaukee, WI"="Wintry", 
                  "Minneapolis, MN"="Wintry", 
                  "New Orleans, LA"="Exclude", 
                  "Newark, NJ"="Mid-Atlantic", 
                  "Philadelphia, PA"="Mid-Atlantic", 
                  "Phoenix, AZ"="Desert", 
                  "Saint Louis, MO"="South Central", 
                  "San Antonio, TX"="Exclude", 
                  "San Diego, CA"="Marine, S CA", 
                  "San Francisco, CA"="Marine, N CA", 
                  "San Jose, CA"="Marine, N CA", 
                  "Seattle, WA"="Exclude", 
                  "Tampa Bay, FL"="Humid", 
                  "Traverse City, MI"="Wintry", 
                  "Washington, DC"="Mid-Atlantic"
                  )
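One subtlety with named-vector maps like mapLoc2Group: subscripting looks up by name only when given a character vector, while a bare factor subscripts by its underlying integer codes, which coincides with the name lookup only when the factor levels happen to be in the same order as the names. A minimal base-R sketch with a hypothetical toy map:

```r
# Toy stand-in for mapLoc2Group (hypothetical entries)
toyMap <- c("Boston, MA"="Exclude", "Chicago, IL"="Wintry", "Miami, FL"="Humid")

# Character subscripts look up by name
toyMap["Chicago, IL"]         # "Wintry"

# A factor subscripts by its integer codes, not by name
f <- factor("Miami, FL", levels=c("Miami, FL", "Boston, MA"))
toyMap[f]                     # code 1 -> first element, "Exclude" -- not "Humid"
toyMap[as.character(f)]       # "Humid", as intended
```

Converting the factor with as.character() before indexing (or ensuring the levels are alphabetical, matching the order of the names) is the safe pattern.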

The mapping is checked to confirm that counts by group are as expected:

# Index the named mapping vector by name (a bare factor would index by integer code)
groupTrainData <- localeTrainData %>%
    mutate(groupNamefct=factor(mapLoc2Group[as.character(locNamefct)]))

groupTestData <- fullDataSplit$testData %>%
    mutate(groupNamefct=factor(mapLoc2Group[as.character(locNamefct)]))

groupTrainData %>%
    count(groupNamefct, locNamefct) %>%
    pivot_wider(locNamefct, names_from=groupNamefct, values_from=n) %>%
    as.data.frame()
##           locNamefct Desert Exclude Humid Marine, N CA Marine, S CA
## 1      Las Vegas, NV   6089      NA    NA           NA           NA
## 2        Phoenix, AZ   6078      NA    NA           NA           NA
## 3         Boston, MA     NA    6115    NA           NA           NA
## 4         Denver, CO     NA    6051    NA           NA           NA
## 5   Indianapolis, IN     NA    6176    NA           NA           NA
## 6        Lincoln, NE     NA    6124    NA           NA           NA
## 7    New Orleans, LA     NA    6101    NA           NA           NA
## 8    San Antonio, TX     NA    6049    NA           NA           NA
## 9        Seattle, WA     NA    6121    NA           NA           NA
## 10       Houston, TX     NA      NA  6150           NA           NA
## 11         Miami, FL     NA      NA  6148           NA           NA
## 12     Tampa Bay, FL     NA      NA  6111           NA           NA
## 13 San Francisco, CA     NA      NA    NA         6068           NA
## 14      San Jose, CA     NA      NA    NA         6121           NA
## 15   Los Angeles, CA     NA      NA    NA           NA         6048
## 16     San Diego, CA     NA      NA    NA           NA         6089
## 17        Newark, NJ     NA      NA    NA           NA           NA
## 18  Philadelphia, PA     NA      NA    NA           NA           NA
## 19    Washington, DC     NA      NA    NA           NA           NA
## 20       Atlanta, GA     NA      NA    NA           NA           NA
## 21        Dallas, TX     NA      NA    NA           NA           NA
## 22   Saint Louis, MO     NA      NA    NA           NA           NA
## 23       Chicago, IL     NA      NA    NA           NA           NA
## 24       Detroit, MI     NA      NA    NA           NA           NA
## 25  Grand Rapids, MI     NA      NA    NA           NA           NA
## 26     Green Bay, WI     NA      NA    NA           NA           NA
## 27       Madison, WI     NA      NA    NA           NA           NA
## 28     Milwaukee, WI     NA      NA    NA           NA           NA
## 29   Minneapolis, MN     NA      NA    NA           NA           NA
## 30 Traverse City, MI     NA      NA    NA           NA           NA
##    Mid-Atlantic South Central Wintry
## 1            NA            NA     NA
## 2            NA            NA     NA
## 3            NA            NA     NA
## 4            NA            NA     NA
## 5            NA            NA     NA
## 6            NA            NA     NA
## 7            NA            NA     NA
## 8            NA            NA     NA
## 9            NA            NA     NA
## 10           NA            NA     NA
## 11           NA            NA     NA
## 12           NA            NA     NA
## 13           NA            NA     NA
## 14           NA            NA     NA
## 15           NA            NA     NA
## 16           NA            NA     NA
## 17         6134            NA     NA
## 18         6109            NA     NA
## 19         6004            NA     NA
## 20           NA          6184     NA
## 21           NA          6108     NA
## 22           NA          6091     NA
## 23           NA            NA   6196
## 24           NA            NA   6139
## 25           NA            NA   6114
## 26           NA            NA   6095
## 27           NA            NA   5973
## 28           NA            NA   6086
## 29           NA            NA   6173
## 30           NA            NA   6169

The mappings appear as expected, and the CV process can be run on the mapped data:

# Run the CV process with a callback for early stopping if 5 iterations show no improvement
xgb_group_cv <- xgbRunModel_002(groupTrainData, 
                                depVar="groupNamefct", 
                                predVars=locXGBPreds, 
                                otherVars=keepVarFull, 
                                critFilterNot=list(groupNamefct="Exclude"),
                                seed=2008191330,
                                nrounds=1000,
                                print_every_n=25, 
                                xgbObjective="multi:softmax", 
                                funcRun=xgboost::xgb.cv, 
                                nfold=5, 
                                num_class=7, 
                                early_stopping_rounds=5
                                )
## [1]  train-merror:0.426384+0.002074  test-merror:0.433919+0.003670 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## [26] train-merror:0.251154+0.001785  test-merror:0.277554+0.003181 
## [51] train-merror:0.194780+0.001394  test-merror:0.235524+0.003321 
## [76] train-merror:0.159187+0.000587  test-merror:0.211514+0.001421 
## [101]    train-merror:0.132973+0.000386  test-merror:0.194897+0.001857 
## [126]    train-merror:0.112957+0.000567  test-merror:0.183792+0.002338 
## [151]    train-merror:0.097110+0.000576  test-merror:0.174548+0.002180 
## [176]    train-merror:0.084068+0.000354  test-merror:0.167938+0.002564 
## [201]    train-merror:0.072955+0.000759  test-merror:0.162090+0.002327 
## [226]    train-merror:0.063473+0.000740  test-merror:0.156233+0.002419 
## [251]    train-merror:0.054869+0.000475  test-merror:0.151443+0.002344 
## [276]    train-merror:0.047687+0.000272  test-merror:0.148219+0.002298 
## [301]    train-merror:0.041461+0.000867  test-merror:0.145199+0.002417 
## [326]    train-merror:0.036140+0.000574  test-merror:0.143094+0.002637 
## [351]    train-merror:0.031345+0.000776  test-merror:0.140602+0.002476 
## [376]    train-merror:0.027190+0.000901  test-merror:0.138396+0.001917 
## [401]    train-merror:0.023852+0.000639  test-merror:0.136484+0.001923 
## [426]    train-merror:0.020748+0.000637  test-merror:0.134470+0.002275 
## [451]    train-merror:0.018063+0.000647  test-merror:0.133362+0.002467 
## Stopping. Best iteration:
## [448]    train-merror:0.018361+0.000549  test-merror:0.133280+0.002461

The model can then be trained using the best iteration count identified by CV (448 rounds):

# Extract best n
bestN <- xgb_group_cv$xgbModel$best_iteration

# And the model can be run for that number of iterations
xgb_group <- xgbRunModel_002(groupTrainData, 
                             depVar="groupNamefct", 
                             predVars=locXGBPreds, 
                             otherVars=keepVarFull, 
                             critFilterNot=list(groupNamefct="Exclude"),
                             seed=2008191350,
                             nrounds=bestN,
                             print_every_n=25, 
                             xgbObjective="multi:softprob", 
                             funcRun=xgboost::xgboost, 
                             num_class=7
                             )
## [1]  train-merror:0.426089 
## [26] train-merror:0.252181 
## [51] train-merror:0.200673 
## [76] train-merror:0.164562 
## [101]    train-merror:0.140114 
## [126]    train-merror:0.121555 
## [151]    train-merror:0.105681 
## [176]    train-merror:0.092694 
## [201]    train-merror:0.082027 
## [226]    train-merror:0.072030 
## [251]    train-merror:0.063284 
## [276]    train-merror:0.056267 
## [301]    train-merror:0.050552 
## [326]    train-merror:0.045183 
## [351]    train-merror:0.040474 
## [376]    train-merror:0.034922 
## [401]    train-merror:0.031688 
## [426]    train-merror:0.028149 
## [448]    train-merror:0.025454

The assessment process can then be run on the trained model:

# ASSESSMENT 1: Variable Importance
xgb_allgroups_importance <- plotXGBImportance(xgb_group, 
                                              featureStems=locXGBPreds, 
                                              stemMapper = varMapper, 
                                              plotTitle="Gain by variable in xgboost", 
                                              plotSubtitle="Locale Group (2016)"
                                              )

# ASSESSMENT 2: Evolution of training error
plotXGBEvolution(xgb_group, isRegression=FALSE, label_every=NULL, yLim=c(0, NA), show_line=TRUE)

## # A tibble: 448 x 3
##     iter type  error
##    <dbl> <chr> <dbl>
##  1     1 train 0.426
##  2     2 train 0.393
##  3     3 train 0.382
##  4     4 train 0.369
##  5     5 train 0.357
##  6     6 train 0.350
##  7     7 train 0.344
##  8     8 train 0.335
##  9     9 train 0.328
## 10    10 train 0.320
## # ... with 438 more rows
# ASSESSMENT 3: Assess performance on test dataset
assessTestData(xgb_group, depVar="groupNamefct", reportBy="groupNamefct", isClassification=TRUE)

# ASSESSMENT 4: Prediction quality vs. confidence (currently only implemented for classification)
plotPredictionConfidencevQuality(xgb_group, depVar="groupNamefct", dataLim=5)

# ASSESSMENT 5: Performance on other data (currently only implemented for classification)
df_group_pred <- assessNonModelDataPredictions(mdl=xgb_group$xgbModel, 
                                               df=filter(groupTestData, 
                                                         !is.na(TempF), 
                                                         year==2016, 
                                                         !(groupNamefct %in% c("Exclude"))
                                                         ), 
                                               depVar="groupNamefct", 
                                               predVars=locXGBPreds, 
                                               yLevels=xgb_group$yTrainLevels, 
                                               ySortVar="None"
                                               )

Classifications on the holdout data are generally strong, with meaningful mismatches as follows:

  • Mid-Atlantic as Wintry (~15%)
  • South Central as Wintry (~10%)
  • Marine S vs Marine N (~5% in each direction)

Suppose two steps are taken to address this:

  • Collapse Marine S and Marine N to a single group, Marine
  • Cut the size of the Wintry training data by 50% so that it carries roughly 4 city-volumes of data rather than 8

set.seed(2008191408)

groupTrainData_002 <- localeTrainData %>%
    mutate(groupName=mapLoc2Group[as.character(locNamefct)], 
           groupNamefct=ifelse(str_detect(groupName, "Marine"), "Marine", groupName)
           ) %>%
    # Randomly keep roughly half of the Wintry rows
    filter(groupNamefct != "Wintry" | rnorm(nrow(.)) >= 0)
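The rnorm(nrow(.)) >= 0 filter keeps each Wintry row independently with probability 0.5, since a standard normal draw is non-negative half the time. A quick base-R sketch (hypothetical seed):

```r
set.seed(123)               # hypothetical seed for reproducibility
keep <- rnorm(10000) >= 0   # each element TRUE with probability 0.5
mean(keep)                  # typically very close to 0.5
```

An equivalent and arguably more direct idiom is runif(nrow(.)) >= 0.5, or sampling row indices with sample().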
    

groupTestData_002 <- fullDataSplit$testData %>%
    mutate(groupName=mapLoc2Group[as.character(locNamefct)], 
           groupNamefct=ifelse(str_detect(groupName, "Marine"), "Marine", groupName)
           )

groupTrainData_002 %>%
    count(groupNamefct, locNamefct) %>%
    pivot_wider(locNamefct, names_from=groupNamefct, values_from=n) %>%
    as.data.frame()
##           locNamefct Desert Exclude Humid Marine Mid-Atlantic South Central
## 1      Las Vegas, NV   6089      NA    NA     NA           NA            NA
## 2        Phoenix, AZ   6078      NA    NA     NA           NA            NA
## 3         Boston, MA     NA    6115    NA     NA           NA            NA
## 4         Denver, CO     NA    6051    NA     NA           NA            NA
## 5   Indianapolis, IN     NA    6176    NA     NA           NA            NA
## 6        Lincoln, NE     NA    6124    NA     NA           NA            NA
## 7    New Orleans, LA     NA    6101    NA     NA           NA            NA
## 8    San Antonio, TX     NA    6049    NA     NA           NA            NA
## 9        Seattle, WA     NA    6121    NA     NA           NA            NA
## 10       Houston, TX     NA      NA  6150     NA           NA            NA
## 11         Miami, FL     NA      NA  6148     NA           NA            NA
## 12     Tampa Bay, FL     NA      NA  6111     NA           NA            NA
## 13   Los Angeles, CA     NA      NA    NA   6048           NA            NA
## 14     San Diego, CA     NA      NA    NA   6089           NA            NA
## 15 San Francisco, CA     NA      NA    NA   6068           NA            NA
## 16      San Jose, CA     NA      NA    NA   6121           NA            NA
## 17        Newark, NJ     NA      NA    NA     NA         6134            NA
## 18  Philadelphia, PA     NA      NA    NA     NA         6109            NA
## 19    Washington, DC     NA      NA    NA     NA         6004            NA
## 20       Atlanta, GA     NA      NA    NA     NA           NA          6184
## 21        Dallas, TX     NA      NA    NA     NA           NA          6108
## 22   Saint Louis, MO     NA      NA    NA     NA           NA          6091
## 23       Chicago, IL     NA      NA    NA     NA           NA            NA
## 24       Detroit, MI     NA      NA    NA     NA           NA            NA
## 25  Grand Rapids, MI     NA      NA    NA     NA           NA            NA
## 26     Green Bay, WI     NA      NA    NA     NA           NA            NA
## 27       Madison, WI     NA      NA    NA     NA           NA            NA
## 28     Milwaukee, WI     NA      NA    NA     NA           NA            NA
## 29   Minneapolis, MN     NA      NA    NA     NA           NA            NA
## 30 Traverse City, MI     NA      NA    NA     NA           NA            NA
##    Wintry
## 1      NA
## 2      NA
## 3      NA
## 4      NA
## 5      NA
## 6      NA
## 7      NA
## 8      NA
## 9      NA
## 10     NA
## 11     NA
## 12     NA
## 13     NA
## 14     NA
## 15     NA
## 16     NA
## 17     NA
## 18     NA
## 19     NA
## 20     NA
## 21     NA
## 22     NA
## 23   3064
## 24   3121
## 25   3021
## 26   3055
## 27   3030
## 28   3048
## 29   3135
## 30   3059
groupTrainData_002 %>%
    count(groupNamefct)
## # A tibble: 7 x 2
##   groupNamefct      n
##   <chr>         <int>
## 1 Desert        12167
## 2 Exclude       42737
## 3 Humid         18409
## 4 Marine        24326
## 5 Mid-Atlantic  18247
## 6 South Central 18383
## 7 Wintry        24533

The model is then trained as before, starting with CV:

# Run the CV process with a callback for early stopping if 5 iterations show no improvement
xgb_group_cv_002 <- xgbRunModel_002(groupTrainData_002, 
                                    depVar="groupNamefct", 
                                    predVars=locXGBPreds, 
                                    otherVars=keepVarFull, 
                                    critFilterNot=list(groupNamefct="Exclude"),
                                    seed=2008191415,
                                    nrounds=1000,
                                    print_every_n=25, 
                                    xgbObjective="multi:softmax", 
                                    funcRun=xgboost::xgb.cv, 
                                    nfold=5, 
                                    num_class=6, 
                                    early_stopping_rounds=5
                                    )
## [1]  train-merror:0.419217+0.002723  test-merror:0.428563+0.005814 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## [26] train-merror:0.231931+0.002269  test-merror:0.263336+0.002946 
## [51] train-merror:0.179166+0.001272  test-merror:0.223642+0.002386 
## [76] train-merror:0.142998+0.000989  test-merror:0.200268+0.002369 
## [101]    train-merror:0.117181+0.001253  test-merror:0.185511+0.002919 
## [126]    train-merror:0.097946+0.001641  test-merror:0.174790+0.003887 
## [151]    train-merror:0.083515+0.001728  test-merror:0.166458+0.003569 
## [176]    train-merror:0.071096+0.001553  test-merror:0.159934+0.002914 
## [201]    train-merror:0.061028+0.001478  test-merror:0.154974+0.003307 
## [226]    train-merror:0.052643+0.001431  test-merror:0.150445+0.002800 
## [251]    train-merror:0.045159+0.000812  test-merror:0.146617+0.003029 
## [276]    train-merror:0.038528+0.000896  test-merror:0.143429+0.003019 
## [301]    train-merror:0.033303+0.000773  test-merror:0.140905+0.003094 
## [326]    train-merror:0.028832+0.000648  test-merror:0.138604+0.002760 
## [351]    train-merror:0.025094+0.000740  test-merror:0.136696+0.002428 
## Stopping. Best iteration:
## [369]    train-merror:0.022521+0.000320  test-merror:0.135010+0.002347

The best number of rounds can again be extracted, and a model trained:

# Extract best n
bestN <- xgb_group_cv_002$xgbModel$best_iteration

# And the model can be run for that number of iterations
xgb_group_002 <- xgbRunModel_002(groupTrainData_002, 
                                 depVar="groupNamefct", 
                                 predVars=locXGBPreds, 
                                 otherVars=keepVarFull, 
                                 critFilterNot=list(groupNamefct="Exclude"),
                                 seed=2008191430,
                                 nrounds=bestN,
                                 print_every_n=25, 
                                 xgbObjective="multi:softprob", 
                                 funcRun=xgboost::xgboost, 
                                 num_class=6
                                 )
## [1]  train-merror:0.417719 
## [26] train-merror:0.238781 
## [51] train-merror:0.186421 
## [76] train-merror:0.153374 
## [101]    train-merror:0.128671 
## [126]    train-merror:0.107488 
## [151]    train-merror:0.092411 
## [176]    train-merror:0.080324 
## [201]    train-merror:0.070280 
## [226]    train-merror:0.060791 
## [251]    train-merror:0.052692 
## [276]    train-merror:0.045947 
## [301]    train-merror:0.040765 
## [326]    train-merror:0.035953 
## [351]    train-merror:0.031706 
## [369]    train-merror:0.029072

The assessment process can then be run on the trained model:

# ASSESSMENT 1: Variable Importance
xgb_allgroups_importance_002 <- plotXGBImportance(xgb_group_002, 
                                                  featureStems=locXGBPreds, 
                                                  stemMapper = varMapper, 
                                                  plotTitle="Gain by variable in xgboost", 
                                                  plotSubtitle="Locale Group (2016)"
                                                  )

# ASSESSMENT 2: Evolution of training error
plotXGBEvolution(xgb_group_002, isRegression=FALSE, label_every=NULL, yLim=c(0, NA), show_line=TRUE)

## # A tibble: 369 x 3
##     iter type  error
##    <dbl> <chr> <dbl>
##  1     1 train 0.418
##  2     2 train 0.392
##  3     3 train 0.375
##  4     4 train 0.363
##  5     5 train 0.355
##  6     6 train 0.346
##  7     7 train 0.338
##  8     8 train 0.328
##  9     9 train 0.317
## 10    10 train 0.310
## # ... with 359 more rows
# ASSESSMENT 3: Assess performance on test dataset
assessTestData(xgb_group_002, depVar="groupNamefct", reportBy="groupNamefct", isClassification=TRUE)

# ASSESSMENT 4: Prediction quality vs. confidence (currently only implemented for classification)
plotPredictionConfidencevQuality(xgb_group_002, depVar="groupNamefct", dataLim=5)

# ASSESSMENT 5: Performance on other data (currently only implemented for classification)
df_group_pred_002 <- assessNonModelDataPredictions(mdl=xgb_group_002$xgbModel, 
                                                   df=filter(groupTestData_002, 
                                                             !is.na(TempF), 
                                                             year==2016, 
                                                             !(groupNamefct %in% c("Exclude"))
                                                             ), 
                                                   depVar="groupNamefct", 
                                                   predVars=locXGBPreds, 
                                                   yLevels=xgb_group_002$yTrainLevels, 
                                                   ySortVar="None"
                                                   )

Overall error levels are comparable. With Wintry no longer the dominant class, its accuracy drops to 85%, but fewer observations are now misclassified as Wintry. Marine and Desert are almost always classified correctly.
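The per-class figures above can be reproduced with a short summary over the prediction table. A minimal base-R sketch, assuming a data frame with actual (`groupNamefct`) and `predicted` columns as returned by `assessNonModelDataPredictions()`; the data below are illustrative stand-ins, not model output:

```r
# Per-class accuracy from a prediction table; `preds` is illustrative stand-in
# data with the same shape as df_group_pred_002
preds <- data.frame(
    groupNamefct = c("Wintry", "Wintry", "Marine", "Desert", "Marine"),
    predicted    = c("Wintry", "Marine", "Marine", "Desert", "Marine")
)

# Mean of the correct/incorrect flag within each actual class
acc_by_class <- tapply(preds$predicted == preds$groupNamefct,
                       preds$groupNamefct, mean)
acc_by_class
```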

The Exclude data are run through the classifier to see where they fall:

# ASSESSMENT 5: Performance on other data (currently only implemented for classification)
df_group_pred_exclude_002 <- assessNonModelDataPredictions(mdl=xgb_group_002$xgbModel, 
                                                           df=filter(groupTestData_002, 
                                                                     !is.na(TempF), 
                                                                     year==2016, 
                                                                     (groupNamefct %in% c("Exclude"))
                                                                     ), 
                                                           depVar="locNamefct", 
                                                           predVars=locXGBPreds, 
                                                           yLevels=xgb_group_002$yTrainLevels, 
                                                           ySortVar="Wintry"
                                                           )

Classifications are broadly as expected based on previous analyses.

The next area for exploration is hyperparameter tuning, including eta (the learning rate) and max_depth. An initial model is run with a substantial increase in depth, from the default of 6 to 50. The CV process is again run, with early stopping, to guard against overfitting:

# Same data and approach as previous, but with maximum depth of 50 (instead of default 6)
# Run the CV process with a callback for early stopping if 5 iterations show no improvement
xgb_group_cv_003 <- xgbRunModel_002(groupTrainData_002, 
                                    depVar="groupNamefct", 
                                    predVars=locXGBPreds, 
                                    otherVars=keepVarFull, 
                                    critFilterNot=list(groupNamefct="Exclude"),
                                    seed=2008191415,
                                    nrounds=1000,
                                    print_every_n=25, 
                                    xgbObjective="multi:softmax", 
                                    funcRun=xgboost::xgb.cv, 
                                    nfold=5, 
                                    num_class=6, 
                                    early_stopping_rounds=5, 
                                    max_depth=50
                                    )
## [1]  train-merror:0.098869+0.000780  test-merror:0.267853+0.004920 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## [26] train-merror:0.000012+0.000012  test-merror:0.152328+0.002940 
## [51] train-merror:0.000000+0.000000  test-merror:0.138727+0.002553 
## [76] train-merror:0.000000+0.000000  test-merror:0.132708+0.002311 
## [101]    train-merror:0.000000+0.000000  test-merror:0.129089+0.002119 
## [126]    train-merror:0.000000+0.000000  test-merror:0.127366+0.002206 
## [151]    train-merror:0.000000+0.000000  test-merror:0.126086+0.002512 
## [176]    train-merror:0.000000+0.000000  test-merror:0.124720+0.002461 
## [201]    train-merror:0.000000+0.000000  test-merror:0.123661+0.002779 
## Stopping. Best iteration:
## [200]    train-merror:0.000000+0.000000  test-merror:0.123600+0.002742

The best number of rounds can again be extracted, and a model trained:

# Extract best n
bestN <- xgb_group_cv_003$xgbModel$best_iteration

# And the model can be run for that number of iterations
xgb_group_003 <- xgbRunModel_002(groupTrainData_002, 
                                 depVar="groupNamefct", 
                                 predVars=locXGBPreds, 
                                 otherVars=keepVarFull, 
                                 critFilterNot=list(groupNamefct="Exclude"),
                                 seed=2008191430,
                                 nrounds=bestN,
                                 print_every_n=25, 
                                 xgbObjective="multi:softprob", 
                                 funcRun=xgboost::xgboost, 
                                 num_class=6, 
                                 max_depth=50
                                 )
## [1]  train-merror:0.094282 
## [26] train-merror:0.000000 
## [51] train-merror:0.000000 
## [76] train-merror:0.000000 
## [101]    train-merror:0.000000 
## [126]    train-merror:0.000000 
## [151]    train-merror:0.000000 
## [176]    train-merror:0.000000 
## [200]    train-merror:0.000000

The assessment process can then be run on the trained model:

# ASSESSMENT 1: Variable Importance
xgb_allgroups_importance_003 <- plotXGBImportance(xgb_group_003, 
                                                  featureStems=locXGBPreds, 
                                                  stemMapper = varMapper, 
                                                  plotTitle="Gain by variable in xgboost", 
                                                  plotSubtitle="Locale Group (2016)"
                                                  )

# ASSESSMENT 2: Evolution of training error
plotXGBEvolution(xgb_group_003, isRegression=FALSE, label_every=NULL, yLim=c(0, NA), show_line=TRUE)

## # A tibble: 200 x 3
##     iter type    error
##    <dbl> <chr>   <dbl>
##  1     1 train 0.0943 
##  2     2 train 0.0558 
##  3     3 train 0.0349 
##  4     4 train 0.0229 
##  5     5 train 0.0156 
##  6     6 train 0.0105 
##  7     7 train 0.00710
##  8     8 train 0.00496
##  9     9 train 0.00324
## 10    10 train 0.00206
## # ... with 190 more rows
# ASSESSMENT 3: Assess performance on test dataset
assessTestData(xgb_group_003, depVar="groupNamefct", reportBy="groupNamefct", isClassification=TRUE)

# ASSESSMENT 4: Prediction quality vs. confidence (currently only implemented for classification)
plotPredictionConfidencevQuality(xgb_group_003, depVar="groupNamefct", dataLim=5)

# ASSESSMENT 5: Performance on other data (currently only implemented for classification)
df_group_pred_003 <- assessNonModelDataPredictions(mdl=xgb_group_003$xgbModel, 
                                                   df=filter(groupTestData_002, 
                                                             !is.na(TempF), 
                                                             year==2016, 
                                                             !(groupNamefct %in% c("Exclude"))
                                                             ), 
                                                   depVar="groupNamefct", 
                                                   predVars=locXGBPreds, 
                                                   yLevels=xgb_group_003$yTrainLevels, 
                                                   ySortVar="None"
                                                   )

There is roughly a 2% increase in classification accuracy, though the model now tends towards overconfidence in its predictions. Next steps are to 1) integrate the more distinct stand-alone locales, and 2) further assess the impact of eta and max_depth.
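Overconfidence can be quantified with a quick comparison of mean predicted probability against realized accuracy. A minimal sketch, assuming a `probPredicted` column (as used in the later `_004` assessments) and a logical `correct` flag; the numbers below are illustrative, not model output:

```r
# Quick calibration check: if mean confidence exceeds realized accuracy,
# the model's probabilities are overconfident (illustrative data)
preds <- data.frame(
    probPredicted = c(0.99, 0.95, 0.90, 0.85, 0.60),
    correct       = c(TRUE, TRUE, FALSE, TRUE, FALSE)
)

calib <- c(meanConfidence = mean(preds$probPredicted),
           accuracy       = mean(preds$correct))
calib
```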

As a next step, the locales of Seattle, San Antonio, Denver, and New Orleans are added back to the classification. These locales generally have strong stand-alone classification, so the goal is to see whether that holds even in comparison to other archetypes.

  • Add Denver, New Orleans, San Antonio, Seattle as stand-alone
  • Consolidate to Marine as per previous
  • Reduce Wintry to half-sampling (functionally, the equivalent of 4 locales of data)

Example code includes:

mapLoc2Group_004 <- c("Atlanta, GA"="South Central", 
                      "Boston, MA"="Exclude", 
                      "Chicago, IL"="Wintry", 
                      "Dallas, TX"="South Central", 
                      "Denver, CO"="Denver", 
                      "Detroit, MI"="Wintry", 
                      "Grand Rapids, MI"="Wintry", 
                      "Green Bay, WI"="Wintry", 
                      "Houston, TX"="Humid", 
                      "Indianapolis, IN"="Exclude", 
                      "Las Vegas, NV"="Desert", 
                      "Lincoln, NE"="Exclude", 
                      "Los Angeles, CA"="Marine", 
                      "Madison, WI"="Wintry", 
                      "Miami, FL"="Humid", 
                      "Milwaukee, WI"="Wintry", 
                      "Minneapolis, MN"="Wintry", 
                      "New Orleans, LA"="New Orleans", 
                      "Newark, NJ"="Mid-Atlantic", 
                      "Philadelphia, PA"="Mid-Atlantic", 
                      "Phoenix, AZ"="Desert", 
                      "Saint Louis, MO"="South Central", 
                      "San Antonio, TX"="San Antonio", 
                      "San Diego, CA"="Marine", 
                      "San Francisco, CA"="Marine", 
                      "San Jose, CA"="Marine", 
                      "Seattle, WA"="Seattle", 
                      "Tampa Bay, FL"="Humid", 
                      "Traverse City, MI"="Wintry", 
                      "Washington, DC"="Mid-Atlantic"
                      )

set.seed(2008201259)

groupTrainData_004 <- localeTrainData %>%
    mutate(groupName=mapLoc2Group_004[locNamefct], groupNamefct=factor(groupName)) %>%
    filter(groupNamefct != "Wintry" | rnorm(nrow(.)) >= 0)

groupTestData_004 <- fullDataSplit$testData %>%
    mutate(groupName=mapLoc2Group_004[locNamefct], groupNamefct=factor(groupName)) %>%
    filter(groupNamefct != "Wintry" | rnorm(nrow(.)) >= 0)

groupTrainData_004 %>%
    count(groupNamefct, locNamefct) %>%
    pivot_wider(locNamefct, names_from=groupNamefct, values_from=n) %>%
    as.data.frame()
##           locNamefct Denver Desert Exclude Humid Marine Mid-Atlantic
## 1         Denver, CO   6051     NA      NA    NA     NA           NA
## 2      Las Vegas, NV     NA   6089      NA    NA     NA           NA
## 3        Phoenix, AZ     NA   6078      NA    NA     NA           NA
## 4         Boston, MA     NA     NA    6115    NA     NA           NA
## 5   Indianapolis, IN     NA     NA    6176    NA     NA           NA
## 6        Lincoln, NE     NA     NA    6124    NA     NA           NA
## 7        Houston, TX     NA     NA      NA  6150     NA           NA
## 8          Miami, FL     NA     NA      NA  6148     NA           NA
## 9      Tampa Bay, FL     NA     NA      NA  6111     NA           NA
## 10   Los Angeles, CA     NA     NA      NA    NA   6048           NA
## 11     San Diego, CA     NA     NA      NA    NA   6089           NA
## 12 San Francisco, CA     NA     NA      NA    NA   6068           NA
## 13      San Jose, CA     NA     NA      NA    NA   6121           NA
## 14        Newark, NJ     NA     NA      NA    NA     NA         6134
## 15  Philadelphia, PA     NA     NA      NA    NA     NA         6109
## 16    Washington, DC     NA     NA      NA    NA     NA         6004
## 17   New Orleans, LA     NA     NA      NA    NA     NA           NA
## 18   San Antonio, TX     NA     NA      NA    NA     NA           NA
## 19       Seattle, WA     NA     NA      NA    NA     NA           NA
## 20       Atlanta, GA     NA     NA      NA    NA     NA           NA
## 21        Dallas, TX     NA     NA      NA    NA     NA           NA
## 22   Saint Louis, MO     NA     NA      NA    NA     NA           NA
## 23       Chicago, IL     NA     NA      NA    NA     NA           NA
## 24       Detroit, MI     NA     NA      NA    NA     NA           NA
## 25  Grand Rapids, MI     NA     NA      NA    NA     NA           NA
## 26     Green Bay, WI     NA     NA      NA    NA     NA           NA
## 27       Madison, WI     NA     NA      NA    NA     NA           NA
## 28     Milwaukee, WI     NA     NA      NA    NA     NA           NA
## 29   Minneapolis, MN     NA     NA      NA    NA     NA           NA
## 30 Traverse City, MI     NA     NA      NA    NA     NA           NA
##    New Orleans San Antonio Seattle South Central Wintry
## 1           NA          NA      NA            NA     NA
## 2           NA          NA      NA            NA     NA
## 3           NA          NA      NA            NA     NA
## 4           NA          NA      NA            NA     NA
## 5           NA          NA      NA            NA     NA
## 6           NA          NA      NA            NA     NA
## 7           NA          NA      NA            NA     NA
## 8           NA          NA      NA            NA     NA
## 9           NA          NA      NA            NA     NA
## 10          NA          NA      NA            NA     NA
## 11          NA          NA      NA            NA     NA
## 12          NA          NA      NA            NA     NA
## 13          NA          NA      NA            NA     NA
## 14          NA          NA      NA            NA     NA
## 15          NA          NA      NA            NA     NA
## 16          NA          NA      NA            NA     NA
## 17        6101          NA      NA            NA     NA
## 18          NA        6049      NA            NA     NA
## 19          NA          NA    6121            NA     NA
## 20          NA          NA      NA          6184     NA
## 21          NA          NA      NA          6108     NA
## 22          NA          NA      NA          6091     NA
## 23          NA          NA      NA            NA   3164
## 24          NA          NA      NA            NA   3125
## 25          NA          NA      NA            NA   3022
## 26          NA          NA      NA            NA   3053
## 27          NA          NA      NA            NA   3016
## 28          NA          NA      NA            NA   3026
## 29          NA          NA      NA            NA   3121
## 30          NA          NA      NA            NA   3145
groupTrainData_004 %>%
    count(groupNamefct)
## # A tibble: 11 x 2
##    groupNamefct      n
##    <fct>         <int>
##  1 Denver         6051
##  2 Desert        12167
##  3 Exclude       18415
##  4 Humid         18409
##  5 Marine        24326
##  6 Mid-Atlantic  18247
##  7 New Orleans    6101
##  8 San Antonio    6049
##  9 Seattle        6121
## 10 South Central 18383
## 11 Wintry        24672

The eta is maintained at its default, with max_depth set to 25. The CV process is run, with early stopping, to determine the number of rounds:

# Same data and approach as previous, but with maximum depth of 25 (instead of default 6)
# Run the CV process with a callback for early stopping if 5 iterations show no improvement
xgb_group_cv_004 <- xgbRunModel_002(groupTrainData_004, 
                                    depVar="groupNamefct", 
                                    predVars=locXGBPreds, 
                                    otherVars=keepVarFull, 
                                    critFilterNot=list(groupNamefct="Exclude"),
                                    seed=2008201313,
                                    nrounds=1000,
                                    print_every_n=25, 
                                    xgbObjective="multi:softmax", 
                                    funcRun=xgboost::xgb.cv, 
                                    nfold=5, 
                                    num_class=10, 
                                    early_stopping_rounds=5, 
                                    max_depth=25
                                    )
## [1]  train-merror:0.182089+0.001636  test-merror:0.345834+0.004751 
## Multiple eval metrics are present. Will use test_merror for early stopping.
## Will train until test_merror hasn't improved in 5 rounds.
## 
## [26] train-merror:0.000089+0.000025  test-merror:0.185080+0.003501 
## [51] train-merror:0.000000+0.000000  test-merror:0.162136+0.003150 
## [76] train-merror:0.000000+0.000000  test-merror:0.151797+0.002760 
## [101]    train-merror:0.000000+0.000000  test-merror:0.146247+0.002543 
## [126]    train-merror:0.000000+0.000000  test-merror:0.142353+0.002586 
## [151]    train-merror:0.000000+0.000000  test-merror:0.139537+0.002048 
## [176]    train-merror:0.000000+0.000000  test-merror:0.138053+0.001619 
## [201]    train-merror:0.000000+0.000000  test-merror:0.136284+0.001616 
## [226]    train-merror:0.000000+0.000000  test-merror:0.135074+0.001760 
## Stopping. Best iteration:
## [240]    train-merror:0.000000+0.000000  test-merror:0.134312+0.001780

The best number of rounds can again be extracted, and a model trained:

# CV Error Evolution
plotXGBEvolution(xgb_group_cv_004, isRegression=FALSE, label_every=NULL, yLim=c(0, NA), show_line=TRUE)

## # A tibble: 490 x 3
##     iter type   error
##    <dbl> <chr>  <dbl>
##  1     1 train 0.182 
##  2     1 test  0.346 
##  3     2 train 0.116 
##  4     2 test  0.307 
##  5     3 train 0.0782
##  6     3 test  0.285 
##  7     4 train 0.0542
##  8     4 test  0.270 
##  9     5 train 0.0389
## 10     5 test  0.260 
## # ... with 480 more rows
# Extract best n
bestN <- xgb_group_cv_004$xgbModel$best_iteration

# And the model can be run for that number of iterations
xgb_group_004 <- xgbRunModel_002(groupTrainData_004, 
                                 depVar="groupNamefct", 
                                 predVars=locXGBPreds, 
                                 otherVars=keepVarFull, 
                                 critFilterNot=list(groupNamefct="Exclude"),
                                 seed=2008201330,
                                 nrounds=bestN,
                                 print_every_n=25, 
                                 xgbObjective="multi:softprob", 
                                 funcRun=xgboost::xgboost, 
                                 num_class=10, 
                                 max_depth=25
                                 )
## [1]  train-merror:0.173522 
## [26] train-merror:0.000051 
## [51] train-merror:0.000000 
## [76] train-merror:0.000000 
## [101]    train-merror:0.000000 
## [126]    train-merror:0.000000 
## [151]    train-merror:0.000000 
## [176]    train-merror:0.000000 
## [201]    train-merror:0.000000 
## [226]    train-merror:0.000000 
## [240]    train-merror:0.000000

The assessment process can then be run on the trained model:

# ASSESSMENT 1: Variable Importance
xgb_allgroups_importance_004 <- plotXGBImportance(xgb_group_004, 
                                                  featureStems=locXGBPreds, 
                                                  stemMapper = varMapper, 
                                                  plotTitle="Gain by variable in xgboost", 
                                                  plotSubtitle="Locale Group (2016)"
                                                  )

# ASSESSMENT 2: Evolution of training error
plotXGBEvolution(xgb_group_004, isRegression=FALSE, label_every=NULL, yLim=c(0, NA), show_line=TRUE)

## # A tibble: 240 x 3
##     iter type    error
##    <dbl> <chr>   <dbl>
##  1     1 train 0.174  
##  2     2 train 0.109  
##  3     3 train 0.0746 
##  4     4 train 0.0525 
##  5     5 train 0.0381 
##  6     6 train 0.0283 
##  7     7 train 0.0209 
##  8     8 train 0.0155 
##  9     9 train 0.0118 
## 10    10 train 0.00906
## # ... with 230 more rows
# ASSESSMENT 3: Assess performance on test dataset
assessTestData(xgb_group_004, depVar="groupNamefct", reportBy="groupNamefct", isClassification=TRUE)

# ASSESSMENT 4: Prediction quality vs. confidence (currently only implemented for classification)
plotPredictionConfidencevQuality(xgb_group_004, depVar="groupNamefct", dataLim=5)

# ASSESSMENT 5: Performance on other data (currently only implemented for classification)
df_group_pred_004 <- assessNonModelDataPredictions(mdl=xgb_group_004$xgbModel, 
                                                   df=filter(groupTestData_004, 
                                                             !is.na(TempF), 
                                                             year==2016, 
                                                             !(groupNamefct %in% c("Exclude"))
                                                             ), 
                                                   depVar="groupNamefct", 
                                                   predVars=locXGBPreds, 
                                                   yLevels=xgb_group_004$yTrainLevels, 
                                                   ySortVar="None"
                                                   )

The model achieves 88% overall accuracy, including ~90% accuracy for each of Denver, San Antonio, and Seattle. New Orleans is lower, at ~80% accuracy, with ~10% of its observations misclassified. The model tends towards overly confident predictions (predicted probabilities higher than achieved accuracy).

Suppose the fully out-of-sample data are separated into records with predicted probability of 97.5% or higher and all others:

# Distribution of predictions
df_group_pred_004 %>%
    filter(groupNamefct != "Exclude") %>%
    mutate(highProb=probPredicted >= 0.975) %>%
    ggplot(aes(x=fct_reorder(groupNamefct, highProb, .fun=mean), fill=highProb)) + 
    geom_bar(position="fill") + 
    labs(x="Actual Archetype", 
         y="Percent of Predictions", 
         title="A majority of predictions have probability >= 97.5%"
         ) + 
    scale_fill_discrete("Predicted Probability >= 97.5%?") + 
    coord_flip() +
    theme(legend.position="bottom")

# Accuracy rates when probability >= 97.5%
df_group_pred_004 %>%
    filter(probPredicted >= 0.975, groupNamefct != "Exclude") %>%
    mutate(correct=fct_drop(predicted)==fct_drop(groupNamefct)) %>%
    group_by(groupNamefct) %>%
    summarize(n=n(), pct=sum(correct)/n()) %>%
    ggplot(aes(x=fct_reorder(groupNamefct, pct), y=pct)) + 
    geom_point() + 
    geom_text(aes(y=pct+0.03, label=paste0(round(100*pct), "%"))) +
    coord_flip() +
    ylim(c(0, NA)) +
    labs(x="Actual Archetype", 
         y="Percent Assigned to Correct Archetype", 
         title="Almost all highly confident (97.5%+) predictions assign the proper archetype"
         )

# Accuracy rates when probability >= 97.5%
df_group_pred_004 %>%
    filter(probPredicted >= 0.975, groupNamefct != "Exclude") %>%
    mutate(correct=fct_drop(predicted)==fct_drop(groupNamefct)) %>%
    group_by(locNamefct) %>%
    summarize(n=n(), pct=sum(correct)/n()) %>%
    ggplot(aes(x=fct_reorder(locNamefct, pct), y=pct)) + 
    geom_point() + 
    geom_text(aes(y=pct+0.03, label=paste0(round(100*pct), "%"))) +
    coord_flip() +
    ylim(c(0, NA)) +
    labs(x="Actual Locale", 
         y="Percent Assigned to Correct Archetype", 
         title="Almost all highly confident (97.5%+) predictions assign the proper archetype"
         )

# Accuracy rates when probability < 97.5%
df_group_pred_004 %>%
    filter(probPredicted < 0.975, groupNamefct != "Exclude") %>%
    mutate(correct=fct_drop(predicted)==fct_drop(groupNamefct)) %>%
    group_by(groupNamefct) %>%
    summarize(n=n(), pct=sum(correct)/n()) %>%
    ggplot(aes(x=fct_reorder(groupNamefct, pct), y=pct)) + 
    geom_point() + 
    geom_text(aes(y=pct+0.03, label=paste0(round(100*pct), "%"))) +
    coord_flip() +
    ylim(c(0, 1)) +
    labs(x="Actual Archetype", 
         y="Percent Assigned to Correct Archetype", 
         title="The correct archetype is assigned ~70% of the time for less confident (< 97.5%) predictions"
         )

# Accuracy rates when probability < 97.5%
df_group_pred_004 %>%
    filter(probPredicted < 0.975, groupNamefct != "Exclude") %>%
    mutate(correct=fct_drop(predicted)==fct_drop(groupNamefct)) %>%
    group_by(locNamefct) %>%
    summarize(n=n(), pct=sum(correct)/n()) %>%
    ggplot(aes(x=fct_reorder(locNamefct, pct), y=pct)) + 
    geom_point() + 
    geom_text(aes(y=pct+0.03, label=paste0(round(100*pct), "%"))) +
    coord_flip() +
    ylim(c(0, 1)) +
    labs(x="Actual Locale", 
         y="Percent Assigned to Correct Archetype", 
         title="The correct archetype is assigned ~70% of the time for less confident (< 97.5%) predictions"
         )

# Confusion matrix for less confident predictions
df_group_pred_004 %>%
    filter(probPredicted < 0.975, groupNamefct != "Exclude") %>%
    mutate(predicted=fct_drop(predicted), groupNamefct=fct_drop(groupNamefct)) %>%
    count(groupNamefct, predicted) %>%
    group_by(groupNamefct) %>%
    mutate(pct=n/sum(n)) %>%
    ggplot(aes(y=groupNamefct, x=predicted)) + 
    geom_tile(aes(fill=pct)) + 
    geom_text(aes(label=paste0(round(100*pct), "%"))) +
    labs(y="Actual Archetype", 
         x="Predicted Archetype", 
         title="Confusion matrix for predictions with probability < 97.5%"
         ) + 
    scale_fill_continuous("", low="white", high="green")

High-probability predictions have ~99% accuracy, while lower-probability predictions average ~70% accuracy. The higher overall accuracy observed for Desert, Denver, and Marine is due to a higher proportion of their predictions being highly confident (these archetypes are well-differentiated).
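The two-bucket comparison behind these figures can be condensed into a single grouped summary. A minimal sketch, splitting at the same 97.5% threshold; the data below are illustrative, whereas the real analysis would use `df_group_pred_004`:

```r
# Accuracy within each confidence bucket, split at 97.5% (illustrative data)
preds <- data.frame(
    probPredicted = c(0.99, 0.98, 0.50, 0.80, 0.99),
    correct       = c(TRUE, TRUE, FALSE, TRUE, TRUE)
)
preds$highProb <- preds$probPredicted >= 0.975

# Named logical buckets: FALSE = less confident, TRUE = highly confident
acc_by_bucket <- tapply(preds$correct, preds$highProb, mean)
acc_by_bucket
```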

The predicted probability matrix from the modeled test data can inform investigation of “how close” the correct locale is when there is a classification error:

xgb_group_004$predData %>%
    mutate(rowNumber=row_number()) %>%
    pivot_longer(-rowNumber) %>%
    inner_join(mutate(xgb_group_004$testData, rowNumber=row_number())) %>%
    filter(groupNamefct!=predicted, name==groupNamefct) %>%
    mutate(delta=probPredicted-value) %>%
    ggplot(aes(x=delta)) + 
    geom_histogram() + 
    facet_wrap(~groupNamefct) + 
    labs(title="Predicted probability for the correct archetype is often very low", 
         y="", x="Difference in maximum predicted probability and predicted probability for correct archetype"
         )
## Joining, by = "rowNumber"
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Next steps are to explore these specific observations to see if something systemic is causing so many “very low” predictions for the correct archetype.
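One way to begin that exploration is to pull out the misclassified rows whose correct-class probability falls below a small cutoff for manual inspection. A minimal sketch with a hypothetical 0.05 cutoff and stand-in columns (`actual`, `predicted`, `probCorrect`) in place of the joined predData/testData frame above:

```r
# Flag misclassified rows where the correct archetype received a very low
# predicted probability (illustrative data; 0.05 is a hypothetical cutoff)
joined <- data.frame(
    actual      = c("Wintry", "Marine", "Desert"),
    predicted   = c("Marine", "Marine", "Humid"),
    probCorrect = c(0.02, 0.98, 0.40)
)

suspect <- subset(joined, actual != predicted & probCorrect < 0.05)
suspect
```

The resulting rows could then be examined against their original METAR observations for systemic patterns.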